What happens when you type https://www.google.com and press <ENTER> ?


Google's website is stored on a computer that we can reach over the internet. That computer is called a server (it runs a web server program), as its role is to provide, or serve, the content of www.google.com.

In a nutshell, when you type www.google.com and press <ENTER>, your browser will fetch the content of the URL www.google.com from the server hosting it and then will parse and render that content.

While this process can take just a few seconds, there are many steps involved in it. Let's break them down and see what happens in more detail.

1 - But how can your browser reach Google's server from the URL www.google.com?

Internet protocol

The Internet Protocol (IP) is the protocol that allows your browser to reach Google's server and exchange data with it.

To put it simply, IP uniquely identifies every device (more precisely, each of its network interfaces) in a network with a number called an IP address, and provides a standard format for sending data between devices so that the data reach the intended device.

Addressing

The process of assigning these numbers to the network interfaces of the devices in a network is called addressing.

IP datagram

An IP datagram is the standard format by which data are exchanged in the Internet Protocol.
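To make the idea of a standard format more concrete, here is a minimal sketch that packs and unpacks the fixed 20-byte IPv4 header with Python's struct module. The function names and the example addresses (taken from the ranges reserved for documentation) are assumptions made for illustration, and the checksum is simply left at 0, so this is not a datagram that a real network stack would accept.

```python
import socket
import struct

# Fixed 20-byte IPv4 header (no options), in network byte order.
IPV4_HEADER = struct.Struct("!BBHHHBBH4s4s")


def build_ipv4_header(src: str, dst: str, payload_len: int, protocol: int = 6) -> bytes:
    """Pack a simplified IPv4 header; protocol 6 means the payload is TCP."""
    version_ihl = (4 << 4) | 5  # version 4, header length of 5 32-bit words
    total_length = IPV4_HEADER.size + payload_len
    return IPV4_HEADER.pack(
        version_ihl, 0, total_length,
        0, 0,             # identification, flags and fragment offset
        64, protocol, 0,  # TTL, protocol, checksum (left at 0 in this sketch)
        socket.inet_aton(src), socket.inet_aton(dst),
    )


def parse_ipv4_header(raw: bytes) -> dict:
    """Unpack the fields that matter for delivering the datagram."""
    fields = IPV4_HEADER.unpack(raw[:IPV4_HEADER.size])
    return {
        "version": fields[0] >> 4,
        "total_length": fields[2],
        "ttl": fields[5],
        "protocol": fields[6],
        "src": socket.inet_ntoa(fields[8]),
        "dst": socket.inet_ntoa(fields[9]),
    }


header = build_ipv4_header("192.0.2.1", "198.51.100.7", payload_len=40)
print(parse_ipv4_header(header))
```

The source and destination addresses in the header are what let routers forward the datagram to the intended device.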

IP versions

There are two versions of IP:

  • IPv4, which uses 32-bit IP addresses;

  • IPv6, which uses 128-bit IP addresses.
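The difference in address size can be checked with Python's standard ipaddress module. The two addresses below are documentation examples, not Google's real addresses.

```python
import ipaddress

v4 = ipaddress.ip_address("192.0.2.1")
v6 = ipaddress.ip_address("2001:db8::1")

for addr in (v4, v6):
    # max_prefixlen is the number of bits in an address of that version.
    print(f"{addr} is IPv{addr.version}: {addr.max_prefixlen} bits, as an integer {int(addr)}")
```

Whatever the notation, an IP address is just a 32-bit or 128-bit number; the dotted and colon-separated forms are there for human readers.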

DNS

The Domain Name System, aka DNS, is a distributed database that maps domain names to IP addresses.

Since your browser connects to Google's server using its IP address, the first thing the browser does is find that address. It does so by initiating a DNS query via the operating system's resolver, a program that queries the DNS to get the IP address of the domain www.google.com.
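A program can ask the operating system's resolver for the same information through socket.getaddrinfo. The sketch below is only an approximation of what a browser does (browsers also keep their own DNS caches), and the helper name resolve is an assumption made for this example.

```python
import socket


def resolve(hostname: str, port: int = 443) -> list[str]:
    """Return the distinct IP addresses the OS resolver finds for hostname."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address as a string.
    return sorted({info[4][0] for info in infos})
```

On a machine with network access, resolve("www.google.com") should return one or more IPv4 and/or IPv6 addresses. The exact values depend on where and when the query is made.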

2 - Transmission Control Protocol

Now that your browser knows Google's IP address, it can exchange data with Google's server via the Internet Protocol, right?

Well, the answer is yes and no. While IP provides a means of exchanging data between hosts, it is connectionless and best-effort: it does not guarantee that the data are delivered to the destination, or that they arrive in order.

That's why, on top of IP, your browser will use another protocol that lets both ends make sure that the data they exchange reach their destination. As its name implies, this protocol is the Transmission Control Protocol, aka TCP.

TCP segments and sequence numbers

TCP also defines its own data format, the TCP segment, which is encapsulated in an IP datagram (remember, IP datagrams can contain any data) and exchanged with peers in an IP network.

Each peer keeps track of two numbers:

  • A sequence number (SEQ), which identifies the position, in its outgoing byte stream, of the first byte of data in the segment it sends;

  • And an acknowledgment (ACK) number, which is the sequence number of the next byte it expects to receive from the other end. This means that all bytes up to, but not including, the ACK number have been received.

These numbers are carried in every TCP segment, and after one peer receives data from the other, it must send back an acknowledgment to confirm that the data were correctly delivered.

This is what makes TCP reliable: if the sender does not receive an acknowledgment for some data in time, it uses the SEQ and ACK numbers to work out what is missing and sends it again.
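The following toy simulation shows how sequence and acknowledgment numbers drive retransmission. It is a deliberately simplified stop-and-wait model, not how a real TCP stack works (no sliding window, no timers, no 32-bit wraparound), and the names Receiver, send_all and drop_first_attempt are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Receiver:
    expected: int  # ACK number: sequence number of the next byte we expect

    def receive(self, seq: int, data: bytes) -> int:
        """Accept only in-order data and return the ACK number to send back."""
        if seq == self.expected:
            self.expected += len(data)
        return self.expected


def send_all(chunks, first_seq, receiver, drop_first_attempt=()):
    """Send chunks one at a time, resending a chunk until it is acknowledged."""
    seq = first_seq
    log = []
    for index, data in enumerate(chunks):
        attempt = 0
        while True:
            attempt += 1
            lost = index in drop_first_attempt and attempt == 1
            # A lost segment never reaches the receiver, so no ACK comes back.
            ack = None if lost else receiver.receive(seq, data)
            log.append((seq, attempt, ack))
            if ack == seq + len(data):  # every byte sent so far is acknowledged
                break
        seq += len(data)
    return log


receiver = Receiver(expected=1000)
log = send_all([b"hello", b"world"], 1000, receiver, drop_first_attempt={1})
for seq, attempt, ack in log:
    print(f"SEQ={seq} attempt={attempt} ACK={ack}")
```

The second chunk is dropped on its first attempt, so the sender sees no acknowledgment for bytes 1005 to 1009 and sends them again.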

TCP connection

Your browser will initiate a new TCP connection with Google's server on port 443 (the default port for HTTPS), in three steps.

Three-way TCP handshake

Three-way TCP handshake. Source: RFC 9293

1 - SYN:

Your browser will generate a random sequence number, SEQ1, and send a SYN (synchronize) TCP segment to Google's server.

2 - SYN-ACK:

Google's server will generate its own random sequence number, SEQ2, and set its acknowledgment number, ACK2, to SEQ1 + 1. It then responds to your browser's SYN by sending a SYN-ACK (synchronize and acknowledge) TCP segment.

3 - ACK:

Finally, your browser will set its acknowledgment number, ACK1, to SEQ2 + 1 and send an ACK TCP segment to Google's server. The connection is now established.
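The arithmetic of the three steps can be sketched in a few lines. This only models the numbers carried by the three segments; the dictionary layout and the function name are assumptions made for this example, and real TCP stacks choose their initial sequence numbers with more care than a plain random draw.

```python
import secrets

MOD = 2**32  # sequence numbers are 32-bit values and wrap around


def three_way_handshake(seq1=None, seq2=None):
    """Return the SYN, SYN-ACK and ACK segments as simple dictionaries."""
    seq1 = secrets.randbelow(MOD) if seq1 is None else seq1  # browser's SEQ1
    seq2 = secrets.randbelow(MOD) if seq2 is None else seq2  # server's SEQ2
    syn = {"flags": "SYN", "seq": seq1}
    syn_ack = {"flags": "SYN-ACK", "seq": seq2, "ack": (seq1 + 1) % MOD}
    ack = {"flags": "ACK", "seq": (seq1 + 1) % MOD, "ack": (seq2 + 1) % MOD}
    return [syn, syn_ack, ack]


for segment in three_way_handshake(seq1=100, seq2=300):
    print(segment)
```

With SEQ1 = 100 and SEQ2 = 300, the server acknowledges 101 and the browser acknowledges 301, matching the SEQ + 1 rule described above.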

TLS handshake

TCP and IP do not provide any security mechanism. Because both protocols carry data in plain text, someone who intercepts the traffic can read its content. An attacker can even send data to one peer while pretending to be the other, simply by setting the right source and destination addresses in the IP datagram.

Transport Layer Security (TLS) provides privacy (data are encrypted, so that only the peers can read them), integrity (data cannot be modified in transit without detection) and authentication (the browser can verify, through the server's certificate, that it is really talking to Google's server) between two communicating applications.

In our case, HTTP is layered on top of TLS, which gives us secure HTTP, or HTTPS.

A secure TLS session must be established between the two peers, and this is done through what's called a TLS handshake where the cryptographic parameters of the session are produced and exchanged.

TLS handshake. Source: RFC 5246
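From a program's point of view, the TCP and TLS handshakes can be triggered with Python's standard socket and ssl modules. This is a minimal sketch of a TLS client, not what a browser actually runs, and the helper names make_client_context and open_tls_connection are assumptions made for this example.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Create a client context that verifies certificates and host names."""
    return ssl.create_default_context()


def open_tls_connection(host: str, port: int = 443) -> ssl.SSLSocket:
    """Open a TCP connection, then perform the TLS handshake on top of it."""
    context = make_client_context()
    raw = socket.create_connection((host, port), timeout=5)  # TCP handshake
    try:
        # wrap_socket performs the TLS handshake before returning.
        return context.wrap_socket(raw, server_hostname=host)
    except Exception:
        raw.close()
        raise
```

On a machine with network access, calling open_tls_connection("www.google.com") and then version() on the returned socket shows which TLS version was negotiated. That version depends on the client and the server.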

3 - HTTP request/response

HTTP is the protocol that your browser and the web server running on Google's server both understand. In its HTTP/1.1 form, shown here, it is a text-based protocol.

HTTP request

Your browser will then create an HTTP request (a GET request) similar to the following (the value of some headers may vary):

GET / HTTP/1.1
Host: www.google.com

The request is encrypted by TLS and placed in a TCP segment. That segment is then encapsulated in an IP datagram and sent to Google's server.
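As a rough sketch, the bytes of such a request could be assembled as follows. The function name build_get_request is an assumption made for this example, and a real browser sends many more headers (User-Agent, Accept, cookies and so on).

```python
def build_get_request(host: str, path: str = "/") -> bytes:
    """Build the bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",  # the empty line marks the end of the headers
        "",
    ]
    # HTTP/1.1 separates lines with CRLF.
    return "\r\n".join(lines).encode("ascii")


print(build_get_request("www.google.com").decode("ascii"))
```

The Host header is what lets a single server, reachable at one IP address, serve several domain names.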

HTTP response

Google's server will receive the TCP segment and read the data that contains the HTTP message. That message will then be passed to the web server, which will handle the request, generate an HTML page and send back an HTTP response containing the generated HTML page:

HTTP/1.1 200 OK
Content-Type: text/html; charset=UTF-8
Content-Length: 155
Connection: close

<html>
  <head>
    <title>Google</title>
    <link type="text/css" rel="stylesheet" href="https://www.gstatic.com/og/style.css">
  </head>
  <body>
    <p>Beautiful google's home page.</p>
    <script src="https://www.gstatic.com/script.js"></script>
  </body>
</html>

As the browser did for the request, the server will create a TCP segment containing the HTTP response, encapsulate the segment in an IP datagram and send it back to your browser.
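On the receiving side, an HTTP/1.1 response can be split into its status line, headers and body. The parser below is a simplified sketch with an assumed function name: it expects the whole response to be in memory and ignores features such as chunked transfer encoding.

```python
def parse_http_response(raw: bytes):
    """Split a raw HTTP/1.1 response into status, reason, headers and body."""
    head, _, body = raw.partition(b"\r\n\r\n")  # an empty line ends the headers
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    _version, status, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return int(status), reason, headers, body


raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html; charset=UTF-8\r\n"
    b"Content-Length: 13\r\n"
    b"\r\n"
    b"<p>Hello.</p>"
)
status, reason, headers, body = parse_http_response(raw)
print(status, reason, headers["content-type"], body)
```

The Content-Length header tells the receiver how many bytes of body to expect, which is how it knows where the response ends.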

4 - Response processing and rendering

From HTML to the rendered page

Firefox browser, showing the HTML content and the rendered page.

The browser will receive the HTTP response which contains the HTML page.

The browser parses the HTML first and builds a Document Object Model (DOM) tree. As parsing progresses, the HTML content is rendered on the screen.

While parsing the HTML, the browser sends new requests for any CSS and JavaScript files referenced by <link> and <script> elements, respectively, as well as for any other external resources such as images and videos.
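To illustrate how external resources are discovered during parsing, here is a small sketch based on Python's standard html.parser module. It is far simpler than a browser's HTML parser, and the class name SubresourceFinder is an assumption made for this example; the page is the sample response shown earlier.

```python
from html.parser import HTMLParser


class SubresourceFinder(HTMLParser):
    """Collect the URLs of stylesheets and scripts referenced by a page."""

    def __init__(self):
        super().__init__()
        self.stylesheets = []
        self.scripts = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rel = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "stylesheet" in rel and attrs.get("href"):
            self.stylesheets.append(attrs["href"])
        elif tag == "script" and attrs.get("src"):
            self.scripts.append(attrs["src"])


page = """<html>
  <head>
    <title>Google</title>
    <link type="text/css" rel="stylesheet" href="https://www.gstatic.com/og/style.css">
  </head>
  <body>
    <p>Beautiful google's home page.</p>
    <script src="https://www.gstatic.com/script.js"></script>
  </body>
</html>"""

finder = SubresourceFinder()
finder.feed(page)
print("stylesheets:", finder.stylesheets)
print("scripts:", finder.scripts)
```

Each URL found this way leads to a new HTTP request, which may in turn require its own DNS lookup, TCP connection and TLS handshake.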

From the CSS files, the browser also builds a CSS Object Model (CSSOM) tree, and the styles from this tree are then applied to the page. The JavaScript files are also compiled and executed, and the page becomes interactive.

Conclusion

Many processes and protocols are involved between the moment a user types a URL in their browser and the moment they see the rendered HTML page. This article covered them only at a high level, without going into the details.

Please tell me what you think in the comments. Did you learn something, or did you find something?