WiFi (more like WtFi) is an incredible advancement in technology. It's become so ubiquitous that we don't really stop to marvel at its complexity, and instead get mad when it takes more than 10 seconds to "like" some dumb shit on the internet. Imagine you walk blindfolded into a room with 10 other blindfolded people. You all also sound exactly the same. Now you need to figure out how to communicate efficiently with a single listener in the room so that nobody else in the room knows what you are saying (except for the listener dude). Yeah, have fun constructing that one by yourself. Let's break this down into several problem areas.

First, how can 10 blindfolded people in a room take turns talking? Enter a technology called CSMA/CA, or Carrier-Sense Multiple Access with Collision Avoidance. This is a multiple access technology that allows endpoints to communicate across a shared medium (air) by sensing when the channel is idle. "Carrier" means the carrier wave, which is a radio wave in either the 2.4 or 5 GHz band, depending on the devices used. So, the endpoints sense the channel, share access with other devices, and employ collision avoidance (rather than collision detection, which is a different technique).

This works by listening to the channel before transmitting, with backoff timers to keep everyone from talking at once. If the channel isn't clear, the endpoint waits a random amount of time before listening to the channel again. If the channel is clear, the endpoint sends a 'request to send' (RTS) message to the access point and awaits a 'clear to send' (CTS) response. If no response is received, the process starts again. If the CTS is received, the transmitting device sends some data and waits for an acknowledgement (ack). The ack serves as an indicator that all of the data was received properly. If no ack is received, the transmission is assumed to have failed, and the process begins again.

So back to the blindfolded room: you listen for other voices, and when you hear nothing, you ask if you may speak. When the listener dude says you may, you say "'Like' this pile of dog shit on Facebook, please. If you received this message, say 'flabbergalactica'". If you don't hear 'flabbergalactica', you wait and try your stupid message again.
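If you'd like to watch the blindfolded room sort itself out, here is a minimal Python sketch of the random backoff idea. It is a toy, not the real 802.11 state machine: time is chopped into slots, the RTS/CTS/data/ack exchange is collapsed into "exactly one station talked in this slot, so it worked", and two stations talking at once counts as a missing ack. The function name `simulate` and the contention window values `CW_MIN` and `CW_MAX` are assumptions picked for illustration.

```python
import random

CW_MIN = 15    # starting contention window in slots (assumed value)
CW_MAX = 1023  # cap on the contention window (assumed value)

def simulate(num_stations, seed=0, max_slots=10_000):
    """Toy slotted CSMA/CA: every station has one frame to deliver."""
    rng = random.Random(seed)
    cw = [CW_MIN] * num_stations
    # Each station starts with a random backoff counter.
    backoff = [rng.randint(0, CW_MIN) for _ in range(num_stations)]
    pending = set(range(num_stations))
    delivered = []   # order in which stations got their frame through
    collisions = 0

    for _ in range(max_slots):
        if not pending:
            break
        ready = sorted(s for s in pending if backoff[s] == 0)
        if len(ready) == 1:
            # One talker: the frame goes through and is acked.
            # Everyone else hears a busy channel and freezes their counter.
            delivered.append(ready[0])
            pending.remove(ready[0])
        elif len(ready) > 1:
            # Several talkers at once: nobody gets an ack, so each one
            # widens its window and draws a fresh random backoff.
            collisions += 1
            for s in ready:
                cw[s] = min(2 * cw[s] + 1, CW_MAX)
                backoff[s] = rng.randint(0, cw[s])
        else:
            # Channel is idle: every waiting station counts down one slot.
            for s in pending:
                backoff[s] -= 1
    return delivered, collisions

order, collisions = simulate(10, seed=1)
print("delivery order:", order, "collisions:", collisions)
```

The widening window after a collision is the important bit: the more crowded the room gets, the longer everyone is willing to wait before trying again.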

 

The next amazing facet of WiFi is the 'language' that it uses. You may have heard things like AM and FM in terms of radio, and these stand for Amplitude Modulation and Frequency Modulation. What this means is that either the amplitude or the frequency of the carrier wave is varied to encode data onto it. If this sounds confusing, you can almost think of your voice when you speak to another person as using frequency modulation. You adjust the pitch of your voice to encode certain information to someone else. A question ends with a rise in pitch, anger is usually lower in pitch, etc. This same principle applies to radio waves too, they just operate at frequencies you can't hear. Here's a great drawing of what AM and FM look like.
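If you'd rather poke at numbers than squint at a drawing, here is a small Python sketch that computes one sample of an AM wave and one sample of an FM wave. The message being sent is just a plain tone. The function names and the default values for `carrier_hz`, `message_hz`, `depth`, and `deviation_hz` are assumptions chosen to keep the numbers readable; real radio carriers are millions of times faster.

```python
import math

def am_sample(t, carrier_hz=20.0, message_hz=2.0, depth=0.5):
    """AM: the message tone stretches and squashes the carrier's height."""
    message = math.cos(2 * math.pi * message_hz * t)
    return (1.0 + depth * message) * math.cos(2 * math.pi * carrier_hz * t)

def fm_sample(t, carrier_hz=20.0, message_hz=2.0, deviation_hz=8.0):
    """FM: the message tone speeds up and slows down the carrier.

    The height never changes. For a cosine message, the carrier's phase
    picks up a sine term scaled by deviation_hz / message_hz.
    """
    beta = deviation_hz / message_hz
    phase = 2 * math.pi * carrier_hz * t + beta * math.sin(2 * math.pi * message_hz * t)
    return math.cos(phase)

# Sample one second of each wave at 1000 points.
times = [n / 1000 for n in range(1000)]
am_wave = [am_sample(t) for t in times]
fm_wave = [fm_sample(t) for t in times]
print("tallest AM sample:", max(am_wave))
print("tallest FM sample:", max(fm_wave))
```

The AM wave's peaks rise and fall with the message, while the FM wave stays the same height and only bunches up or spreads out.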

 

 

This should give you a great understanding of what AM and FM look like in MS Paint. My depictions above aren't perfect sine waves, but I hope you get the general idea. WiFi uses a modulation scheme called QAM, which stands for Quadrature Amplitude Modulation. This uses 2 carrier waves instead of only 1, and the combined signal ends up modulated in both amplitude (as seen above) and phase. What is meant by being in or out of phase? Here's another picture...

In this case, the 2 waves have the same frequency but are 180° out of phase, so they cancel each other out. This is a simplified example, but if you have 2 waves that are 90° out of phase with each other, they are called quadrature carrier waves. If the amplitudes of these 2 waves are modulated, and the waves are then summed together, you get 1 wave that can be said to be modulated in both phase and amplitude. This is essentially how QAM works to encode data. The receiver analyzes the wave, and has a list of the binary data that each waveform corresponds to, based on its phase and amplitude. The receiver may say "if the wave is 135° out of phase and has an amplitude of 25%, this corresponds to binary '01'. If the wave is 45° out of phase with an amplitude of 75%, this corresponds to binary '10'", and so forth. This is how data is sent and received using QAM.

One issue that you have to account for in QAM is error correction. Radio signals are prone to environmental deterioration like reflection, refraction, multipath, etc. Think of it like when you scream inside of a cave, because you really haven't lived until you scream inside of a cave. Your voice bounces off all of the walls and other crap around you, and becomes distorted. The same thing happens with radio signals. The more waveform/symbol possibilities you have in the QAM scheme, the more difficult this error correction becomes. The number of possible symbols in a QAM scheme is prepended to the name (16-QAM, 256-QAM, etc). The higher the QAM scheme, the more 'dense' the possibilities are, the easier it is for a distorted symbol to be mistaken for its neighbor, and the more difficult it is to error correct.

You may have to re-read this a few times and look at my highly accurate MS Paint drawings again, because there is no real easy way to describe QAM other than "The amplitudes of 2 out-of-phase waves are modulated and then summed up for a phase and amplitude modulated carrier wave that is then mapped to corresponding symbols on a polar graph (known as a constellation diagram)", or possibly more simply "Radio waves are dicked with in a predictable way where they mean numbers on the receiver". Here's a sample 4-QAM graph of this.
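For the code-minded, here is a rough 4-QAM sketch in Python. Each pair of bits picks an amplitude for each of the 2 quadrature carriers, the carriers are summed into 1 wave, and the receiver recovers the 2 amplitudes and picks the closest known symbol. The bit-to-symbol table in `CONSTELLATION`, the function names, and the choice of 4 carrier cycles over 64 samples per symbol are all assumptions for illustration; real WiFi hardware does far more than this.

```python
import math

# Assumed labeling: each 2-bit symbol maps to (in-phase, quadrature) amplitudes.
CONSTELLATION = {"00": (1, 1), "01": (-1, 1), "11": (-1, -1), "10": (1, -1)}

def modulate(bits, cycles=4, num_samples=64):
    """Sum 2 carriers that are 90 degrees apart, each scaled by one amplitude."""
    i_amp, q_amp = CONSTELLATION[bits]
    wave = []
    for k in range(num_samples):
        angle = 2 * math.pi * cycles * k / num_samples
        wave.append(i_amp * math.cos(angle) - q_amp * math.sin(angle))
    return wave

def demodulate(wave, cycles=4):
    """Correlate against each carrier, then pick the nearest known symbol."""
    n = len(wave)
    i_hat = q_hat = 0.0
    for k, sample in enumerate(wave):
        angle = 2 * math.pi * cycles * k / n
        i_hat += sample * math.cos(angle)
        q_hat -= sample * math.sin(angle)
    i_hat *= 2 / n
    q_hat *= 2 / n

    def distance(bits):
        i_amp, q_amp = CONSTELLATION[bits]
        return (i_amp - i_hat) ** 2 + (q_amp - q_hat) ** 2

    return min(CONSTELLATION, key=distance)

def amplitude_and_phase(bits):
    """The same symbol seen as 1 wave: its overall height and phase in degrees."""
    i_amp, q_amp = CONSTELLATION[bits]
    return math.hypot(i_amp, q_amp), math.degrees(math.atan2(q_amp, i_amp)) % 360

for bits in CONSTELLATION:
    print(bits, amplitude_and_phase(bits), "decoded as", demodulate(modulate(bits)))
```

With only 4 symbols, every point has the same amplitude and they differ only in phase. Denser schemes like 16-QAM add more amplitude levels, which is what packs the points closer together.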

Now that we know how the devices can share the airtime and encode data to transmit to a receiver, how do they do this securely? There are a bunch of encryption schemes out there, but the one that is most prevalent in modern household WiFi networks is WPA2-PSK, or WiFi Protected Access 2 with Pre-Shared Key. The goal is to prove to the AP that the station knows the WiFi password without ever sending it over the air, and to generate a unique key used to keep the data encrypted from other stations on the same WiFi network.

This unique key (called a pairwise transient key, or PTK) is constructed by combining the following information: Pairwise Master Key (derived from the WiFi password), random number generated by AP, random number generated by endpoint, AP hardware address, endpoint hardware address. Looking at this list, everything should already be known to both the AP and endpoint, except for each other's random numbers. The first step is that the AP sends the endpoint its random number, at which point the endpoint has all of the information needed to construct the unique encryption key. The endpoint then sends its random number to the AP, which provides the AP with all of the information needed to construct that same encryption key. Since these keys rely on a shared secret (the WiFi password), the endpoint can prove that it knows the password by attaching an integrity check computed with the new key, which the AP verifies with its own matching calculation, without the password ever being sent over the air.

After the PTK is established between the AP and endpoint, the AP sends the endpoint what's called the Group Temporal Key (GTK), encrypted with the newly established PTK. The group key is used for communications that the AP needs to make to all of the stations, such as asking everyone a question, or telling everyone a new piece of information. This way, there can be both a shared encrypted communication medium, and also a personal encrypted communication medium. Here's yet another diagram to depict this.
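The key mixing can also be sketched in a few lines of Python using only the standard library. This is modeled on my understanding of how WPA2-PSK derives its keys (PBKDF2 over the password and network name for the master key, then an HMAC-SHA1 based expansion for the PTK), so treat the details as an illustration and not a reference implementation. The function names, the network name, the password, the hardware addresses, and the fixed 48-byte key length are all made up for the example.

```python
import hashlib
import hmac
import os

def derive_pmk(password, network_name):
    """Stretch the WiFi password and network name into a 32-byte master key."""
    return hashlib.pbkdf2_hmac("sha1", password.encode(), network_name.encode(), 4096, 32)

def derive_ptk(pmk, ap_addr, sta_addr, ap_nonce, sta_nonce, length=48):
    """Mix master key, both hardware addresses, and both random numbers."""
    # Sorting each pair means both sides build the same input bytes.
    data = (min(ap_addr, sta_addr) + max(ap_addr, sta_addr)
            + min(ap_nonce, sta_nonce) + max(ap_nonce, sta_nonce))
    label = b"Pairwise key expansion"
    out = b""
    counter = 0
    while len(out) < length:
        block = label + b"\x00" + data + bytes([counter])
        out += hmac.new(pmk, block, hashlib.sha1).digest()
        counter += 1
    return out[:length]

def integrity_check(ptk, message):
    """Proof of knowing the key: a keyed hash using the first 16 bytes of the PTK."""
    return hmac.new(ptk[:16], message, hashlib.sha1).digest()[:16]

# Hypothetical network and devices.
ap_addr = bytes.fromhex("aabbccddeeff")
sta_addr = bytes.fromhex("112233445566")
ap_nonce, sta_nonce = os.urandom(32), os.urandom(32)

# Each side runs the same math with the same password and never sends it.
ap_ptk = derive_ptk(derive_pmk("hunter2hunter2", "HomeNet"), ap_addr, sta_addr, ap_nonce, sta_nonce)
sta_ptk = derive_ptk(derive_pmk("hunter2hunter2", "HomeNet"), ap_addr, sta_addr, ap_nonce, sta_nonce)
print("keys match:", hmac.compare_digest(ap_ptk, sta_ptk))
```

Because the random numbers change on every connection, the PTK is different each time, even though the password stays the same.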

 

So yeah, now you know some of the incredible technology developed for you to order a gallon of fox piss on your phone.  THE END.