Mastering a high-level tool such as a framework is useful, but it is the fundamentals that keep your skills from becoming obsolete and keep you from being limited by any one tool.
Today we will not use a framework or the higher-level packages of the Python standard library; we will write a Python server using only the socket interface from the standard library.
Frameworks and the underlying layer
Now that Python server frameworks (such as Django, Twisted and web.py) are so prevalent, writing a server directly on sockets may look like a thankless and foolish exercise.
The point of a framework is to hide the underlying details and to offer a more developer-friendly API that takes care of architectural concerns such as MVC.
A framework lets us build a mature Python server quickly. However, the framework itself depends on the underlying layer (such as sockets). Understanding sockets not only helps us use a framework better, it also shows us how the framework is designed.
Moreover, with a good grasp of socket programming and other system programming, you can design and develop a framework of your own.
If you can start from sockets, implement a complete Python server that supports application-layer protocols and handles MVC (Model-View-Controller) and multithreading, and then organize it into a clear set of functions or classes presented to users as an interface (API), you have in effect designed a framework.
The socket interface is actually a set of system calls provided by the operating system.
Sockets are therefore not limited to Python. You can write the same socket server in C or Java, and every language uses sockets in essentially the same way (Apache, for instance, is a server implemented in C).
A framework, by contrast, cannot be carried across languages.
A framework handles many details for you and so speeds up development, but it is also limited by the performance of Python itself.
Many successful websites were developed quickly in dynamic languages such as Python, Ruby or PHP (Twitter and Facebook are examples). Once a site succeeds, its code is often rewritten in a more efficient language such as C or Java so that the servers can handle hundreds of millions of requests a day.
At that point, the underlying layer matters far more than the framework.
A brief introduction to TCP/IP and sockets
Back to our task.
We need to know a little about network transmission, in particular the TCP/IP protocols and sockets.
A socket is a means of inter-process communication; it is an interface built on top of the network transport protocols.
There are several types of socket, for example those based on TCP and those based on UDP (two transport protocols). TCP sockets are the most commonly used.
A TCP socket is somewhat like a duplex pipe. One process writes a byte stream to, or reads it from, one end of the socket, while another process reads from or writes to the other end. Unlike a pipe, the two processes communicating through a socket can be on two different computers.
The TCP protocol defines the communication rules that make this exchange between processes work reliably across a network.
A duplex pipe lives inside a single computer, so there is no need to identify the computers on which the two processes run. A socket must carry address information in order to communicate over a network.
A socket connection involves four pieces of address information: the IP addresses of the two computers and the ports used by the two processes. The IP address locates the computer, and the port locates the process (several processes on one computer can each use a different port).
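To make the four pieces of address information concrete, here is a minimal Python 3 sketch that opens a connection on the local machine and reads both (IP, port) pairs from each end. The function name connection_addresses is hypothetical and not part of the original article; the methods getsockname() and getpeername() are standard socket methods.

```python
import socket

def connection_addresses():
    """Open a loopback TCP connection and return its address pairs."""
    # Port 0 asks the operating system to pick any free port.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(listener.getsockname())
    conn, _ = listener.accept()

    # Seen from the client: its own (ip, port), then the server's (ip, port).
    local, remote = client.getsockname(), client.getpeername()
    # Seen from the server, the same two pairs appear with the roles swapped.
    mirrored = (conn.getpeername(), conn.getsockname())

    for s in (conn, client, listener):
        s.close()
    return local, remote, mirrored
```

Calling connection_addresses() returns the client's pair, the server's pair, and the same two pairs as reported by the server's end of the connection. Both IP addresses are 127.0.0.1 here only because both processes run on one machine.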
TCP socket
On the network, we let one computer act as the server.
The server opens one of its ports and passively waits for other computers to connect.
When another computer, acting as the client, actively connects to the server through a socket, the server starts serving that client.
In Python, we use the socket package of the standard library for low-level socket programming.
On the server side, we first use the bind() method to give the socket a fixed address and port, and then use the listen() method to listen passively on that port.
When a client tries to connect with the connect() method, the server calls accept() to accept the connection, which yields a connected socket.
socket.socket() creates a socket object; its arguments state that the socket uses IPv4 (AF_INET, IP version 4) and the TCP protocol (SOCK_STREAM).
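The server-side steps described above (create, bind, listen, accept) can be sketched as follows in Python 3. This is not the article's original listing: the function names open_listener and serve_one, the port 8000, the backlog of 3 and the reply text are all assumptions made for illustration.

```python
import socket

def open_listener(host="", port=8000):
    """Create a TCP socket, bind it to (host, port) and start listening."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # IPv4, TCP
    # Allow the port to be reused right after the server is restarted.
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind((host, port))   # "" means every local network interface
    s.listen(3)            # up to 3 connections may wait in the queue
    return s

def serve_one(listener, reply=b"Yes, I hear you."):
    """Accept a single client, read its message, answer, and hang up."""
    conn, addr = listener.accept()   # blocks until a client connects
    with conn:                       # closes the connected socket on exit
        request = conn.recv(1024)    # read at most 1024 bytes
        conn.sendall(reply)
    return addr, request
```

To try it, call serve_one(open_listener()) in one terminal; the call blocks until a client connects, then returns the client's address and the bytes it sent.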
Then, on another computer acting as the client, we actively call the connect() method with the IP address and port of the server (on Linux, you can run $ ifconfig to look up a machine's IP address), so that the client can reach the server and establish a connection.
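A matching client might look like the sketch below (Python 3). The function name ask_server and the default message are hypothetical; the host and port must be those of whatever server is actually running.

```python
import socket

def ask_server(host, port, message=b"Hello, server"):
    """Connect to (host, port), send one message and return the reply."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # IPv4, TCP
    try:
        s.connect((host, port))   # actively open the connection
        s.sendall(message)        # send the whole message
        return s.recv(1024)       # wait for up to 1024 bytes of reply
    finally:
        s.close()                 # always release the socket
```

For example, ask_server("127.0.0.1", 8000) would talk to a server listening on port 8000 of the same machine, assuming such a server has been started.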
Once the connection is established, either end of the socket can call the recv() method to receive data and the sendall() method to send data.
In this way, two processes on two different computers can communicate.
When the communication is over, we call the close() method to close the socket connection.
(If you do not have two computers for the experiment, you can make the client connect to the IP address 127.0.0.1 instead. This is the loopback address, which always refers to the local host.)
An HTTP server based on TCP sockets
In the example above, we were able to establish a connection between two remote computers using TCP sockets.
However, raw socket transmission is so unconstrained that it creates many security and compatibility problems.
We therefore usually rely on an application-layer protocol (such as HTTP) to specify how the socket is used and what format the transmitted information takes.
HTTP uses TCP sockets in a request-response fashion.
The client sends a piece of text to the server as the request, and after receiving it the server sends a piece of text back to the client as the response.
In the simplest form of the protocol, which is what we use here, the TCP socket is closed after one such request-response exchange.
The next request creates a new socket.
The request and the response are both essentially text, but HTTP imposes a specific format on each.
Request <-> response
Now, let's write an HTTP server.
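The following Python 3 sketch shows one way such a server could be written with nothing but the socket package. It is a reconstruction under assumptions, not the author's original listing: the port 8000, the HTML page, and the function names picture_response, handle_request and serve_forever are illustrative. The names text_content and test.jpg come from the article. The start line uses the concrete version HTTP/1.0 where the article writes HTTP/1.x.

```python
import socket

HOST, PORT = "", 8000   # assumed address: every interface, port 8000

# Response for the page: start line, one header, blank line, HTML body.
text_content = (b"HTTP/1.0 200 OK\r\n"
                b"Content-Type: text/html\r\n"
                b"\r\n"
                b"<html><head><title>WOW</title></head><body>"
                b"<p>Wow, Python Server</p>"
                b'<img src="test.jpg"/></body></html>')

def picture_response(path="test.jpg"):
    """Build the response that carries the picture (pic_content)."""
    with open(path, "rb") as f:          # binary mode: the body is raw bytes
        body = f.read()
    # image/jpg follows the article; the registered type name is image/jpeg.
    return b"HTTP/1.0 200 OK\r\nContent-Type: image/jpg\r\n\r\n" + body

def handle_request(request, picture_path="test.jpg"):
    """Choose the response for one raw request; None means no answer."""
    parts = request.split(b" ")
    if len(parts) < 2 or parts[0] != b"GET":
        return None                      # this sketch only answers GET
    if parts[1] == b"/test.jpg":
        return picture_response(picture_path)
    return text_content

def serve_forever(host=HOST, port=PORT):
    """Accept one connection at a time and answer it; never returns."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind((host, port))
    s.listen(3)                          # up to 3 connections may wait
    while True:
        conn, addr = s.accept()
        request = conn.recv(1024)        # assumes the request fits in 1024 bytes
        print("Connected by", addr)
        print("Request is:", request)
        content = handle_request(request)
        if content is not None:
            conn.sendall(content)
        conn.close()                     # one request per connection
```

Calling serve_forever() starts the server; stop it with Ctrl-C. Keeping the decision logic in handle_request, separate from the socket loop, is a design choice of this sketch that makes the logic easy to try without a network.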
Explanation of the HTTP server program
Depending on the request, the server sends one of two messages, text_content or pic_content, to the client as the response.
The whole response has three parts: the start line, the header information and the body. The start line is the first line, HTTP/1.x 200 OK.
It consists of three fields separated by spaces. HTTP/1.x is the HTTP version in use. 200 is the status code; the HTTP protocol defines 200 to mean that the server received and processed the request normally. OK is the human-readable description of the status code.
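The three fields of the start line can be pulled apart with a single split, as in this small Python 3 sketch. The helper names parse_status_line and standard_reason are hypothetical; HTTPStatus is the standard library's table of status codes.

```python
from http import HTTPStatus

def parse_status_line(line):
    """Split a response start line into version, status code and description."""
    # Split on the first two spaces only: the description may contain spaces.
    version, code, reason = line.split(" ", 2)
    return version, int(code), reason

def standard_reason(code):
    """Look up the description the standard library associates with a code."""
    return HTTPStatus(code).phrase
```

For instance, parse_status_line("HTTP/1.x 200 OK") yields the version string, the integer 200 and the text "OK", and standard_reason(200) returns "OK" as well.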
The header information follows the start line, and a blank line separates it from the body.
Both text_content and pic_content here have a single header line. In text_content, that header (Content-Type: text/html) states that the body is HTML text.
The header of pic_content (Content-Type: image/jpg) states that the body is a jpg picture.
The body is the content of the html or jpg file.
(Note that we open the jpg file in "rb" mode. On Windows a jpg must be opened as a binary file; Python 2 on UNIX made no distinction between text and binary files, but in Python 3 binary mode is needed on every platform, because text mode would try to decode the bytes as characters.)
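The three-part layout of a response (start line, headers, blank line, body) can be assembled by a small helper such as the one below, written for Python 3. The function build_response is hypothetical, and the Content-Length header is an addition of this sketch that the article's responses do not include.

```python
def build_response(body, content_type, status=200, reason="OK"):
    """Assemble start line, headers, a blank line and the body into bytes."""
    if isinstance(body, str):
        body = body.encode("utf-8")      # text bodies are sent as bytes too
    start_line = "HTTP/1.0 {} {}".format(status, reason)
    headers = ["Content-Type: " + content_type,
               "Content-Length: {}".format(len(body))]
    # Lines end with CRLF; an empty line marks the end of the headers.
    head = "\r\n".join([start_line] + headers) + "\r\n\r\n"
    return head.encode("ascii") + body
```

With it, the page response would be build_response(html_text, "text/html"), and the picture response would be build_response(picture_bytes, "image/jpg"), where picture_bytes was read from a file opened in "rb" mode.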
We did not write a client program; later on we will use a browser as the client.
The client sends the request to the server.
Although a request, like a response, can be divided into three parts, its format differs from that of a response.
For example, the request for our picture begins with the method GET and the URL /test.jpg.
The start line has three parts: the request method, the URL, and the HTTP version.
The request method can be GET, PUT, POST, DELETE or HEAD, among others. GET and POST are the most common.
GET asks the server to send a resource to the client; POST asks the server to accept data sent by the client.
When we open a web page, we normally use GET; when we fill in a form and submit it, we normally use POST.
The second part, the URL, usually points to a resource (on this server or elsewhere). In our case it points to test.jpg in the server's current directory.
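A request can be taken apart along the same lines. The Python 3 sketch below splits a raw request into its start-line fields, its headers and its body; the function name parse_request is hypothetical, and the sample request in the usage note is made up for illustration rather than copied from a real browser.

```python
def parse_request(raw):
    """Split a raw HTTP request (bytes) into method, URL, version, headers, body."""
    # The first blank line separates the head of the request from its body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # Raises ValueError if the start line does not have three fields.
    method, url, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, url, version, headers, body
```

Given b"GET /test.jpg HTTP/1.1\r\nHost: 127.0.0.1:8000\r\n\r\n", the function returns "GET", "/test.jpg", "HTTP/1.1", a dictionary with one host entry, and an empty body.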
According to the HTTP protocol, the server performs some operation in response to the request.
Our Python server first checks the request method and then generates a different response (text_content or pic_content) depending on the URL.
The response is then sent back to the client.
Trying it with a browser
To go with the server program above, I saved a picture file named test.jpg in the folder that holds the Python program.
We run the Python program in a terminal as the server and then open a browser as the client.
(If you have time, you can also write the client in Python. The principle is the same as for the TCP socket client above.)
In the browser's address bar, enter the address and port of the server; on the same computer the address is 127.0.0.1.
(Of course, you can also use another computer and enter the server's IP address.)
And there it is: a server implemented in Python, written directly on sockets.
In the terminal, we can see that the browser actually made two requests.
The first request is for the page itself (the key information is in the start line, and the body of this request is empty).
In response to this request, our Python program sends the contents of text_content to the browser.
When the browser receives text_content, it finds that the HTML refers to the picture test.jpg, so it sends a second request, this time for /test.jpg.
After parsing the start line, our Python program finds that /test.jpg satisfies the if condition, so it sends pic_content to the client.
Finally, the browser renders the HTML text and the picture according to the syntax of HTML.
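For readers who want to replace the browser with a Python client, as suggested earlier in this section, a minimal Python 3 sketch could look like this. The function name http_get is hypothetical. The client sends a bare GET request and reads until the server closes the socket, which matches the one-request-per-connection behaviour described above.

```python
import socket

def http_get(host, port, path="/"):
    """Send a bare GET request over a TCP socket and return the raw response."""
    request = "GET {} HTTP/1.0\r\nHost: {}\r\n\r\n".format(path, host)
    with socket.create_connection((host, port), timeout=5) as s:
        s.sendall(request.encode("ascii"))
        chunks = []
        while True:
            data = s.recv(4096)
            if not data:          # empty read: the server closed the socket
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling http_get("127.0.0.1", 8000) and then http_get("127.0.0.1", 8000, "/test.jpg") would reproduce the two requests the browser makes, assuming the server listens on port 8000.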
Directions for further exploration
1) In the server program above, we used a while loop to keep the server running.
In fact, drawing on what we know about multithreading, we can turn the body of the while loop into multi-process or multi-threaded work.
2) Our server program is far from complete. We can have it call other Python functions to do more complex things, for example a time server that returns the date and time to the client. You can also use the database support bundled with Python to build a complete LAMP-style server.
3) socket is a relatively low-level package. The Python standard library also contains higher-level packages such as SocketServer, SimpleHTTPServer, CGIHTTPServer and cgi (in Python 3, the first three became socketserver and http.server). All of them exist to make sockets easier to use. If you already understand sockets, these packages are easy to understand, and with them you can write a fairly mature server.
4) After going through all this hardship, you may find frameworks so convenient that you decide to use one. Or you may by now be eager to take part in developing a framework.
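As one possible illustration of points 1) to 3) together, the Python 3 sketch below builds a time server on the higher-level packages http.server and socketserver, handling each request in its own thread. The class names TimeHandler and ThreadedHTTPServer, the function make_time_server and the port 8000 are assumptions chosen for this example.

```python
import datetime
import http.server
import socketserver

class TimeHandler(http.server.BaseHTTPRequestHandler):
    """Answer every GET request with the current date and time."""

    def do_GET(self):
        body = datetime.datetime.now().isoformat().encode("ascii")
        self.send_response(200)                      # writes the start line
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()                           # writes the blank line
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass   # keep the terminal quiet in this sketch

class ThreadedHTTPServer(socketserver.ThreadingMixIn, http.server.HTTPServer):
    """HTTPServer that handles each request in a separate thread."""
    daemon_threads = True

def make_time_server(host="127.0.0.1", port=8000):
    """Create the server object; call serve_forever() on it to start."""
    return ThreadedHTTPServer((host, port), TimeHandler)
```

Running make_time_server().serve_forever() and opening the server's address in a browser should show the current date and time. Compared with the raw socket version, the package takes care of parsing the request and formatting the start line and headers.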