Detailed explanation of the Python requests module

1. Module description

requests is an HTTP library released under the Apache2 license.

It is written in Python and is more concise to use than the urllib2 module.

requests supports HTTP keep-alive and connection pooling, cookie-based session persistence, file uploads, automatic decoding of response content, and automatic encoding of internationalized URLs and POST data.

It is a high-level wrapper around Python's built-in HTTP modules, which makes issuing network requests from Python much more user-friendly. With requests you can easily perform most of the operations a browser can.

Modern, international and friendly.

requests implements persistent connections (keep-alive) automatically.
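
For example, here is a minimal sketch (assuming https://httpbin.org/get is reachable as a test endpoint) that reuses one Session so repeated requests to the same host share a pooled keep-alive connection:

import requests

# A Session reuses the underlying TCP connection (keep-alive) and the
# connection pool for repeated requests to the same host.
with requests.Session() as s:
    for _ in range(3):
        r = s.get('https://httpbin.org/get')
        print(r.status_code)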

2. Basic introduction

1) Import the module

import requests

2) Sending requests concisely

Sample code: fetch a web page (a personal GitHub page)

import requests

r = requests.get('https://github.com/Ranxf')        # the most basic GET request, without parameters
r1 = requests.get(url='http://dict.baidu.com/s', params={'wd': 'python'})       # GET request with parameters

The same pattern works for the other HTTP methods:

requests.get('https://github.com/timeline.json')         # GET request
requests.post('http://httpbin.org/post')                  # POST request
requests.put('http://httpbin.org/put')                    # PUT request
requests.delete('http://httpbin.org/delete')              # DELETE request
requests.head('http://httpbin.org/get')                   # HEAD request
requests.options('http://httpbin.org/get')                # OPTIONS request

3) Passing URL parameters

>>> url_params = {'key': 'value'}       # a dictionary of parameters; keys whose value is None are not added to the URL
>>> r = requests.get('your url', params=url_params)
>>> print(r.url)
your url?key=value
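
A minimal sketch of the None behaviour mentioned above, using httpbin.org as an assumed test endpoint:

import requests

payload = {'key1': 'value1', 'key2': None}   # key2 is None, so it is dropped from the query string
r = requests.get('https://httpbin.org/get', params=payload)
print(r.url)   # https://httpbin.org/get?key1=value1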

4) The content of the response

r.encoding                         # the current encoding
r.encoding = 'utf-8'               # set the encoding
r.text                             # the response body as a string, decoded with the encoding taken from the response headers
r.content                          # the response body in bytes; gzip and deflate transfer encodings are decoded automatically

r.headers                          # the response headers as a dictionary-like object; keys are case-insensitive, and r.headers.get(key) returns None for a missing key

r.status_code                      # the response status code
r.raw                              # the raw response object (from urllib3); call r.raw.read() after requesting with stream=True
r.ok                               # True if the status code indicates success
# *Special methods*
r.json()                           # the JSON decoder built into requests; parses the body as JSON and raises an exception if the body is not valid JSON
r.raise_for_status()               # raises an exception for failed (4xx/5xx) responses
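
As a hedged sketch of working with a streamed body (httpbin.org is assumed as the test endpoint), passing stream=True defers the download so the body can be read in chunks instead of all at once:

import requests

r = requests.get('https://httpbin.org/bytes/1024', stream=True)
with open('payload.bin', 'wb') as f:
    for chunk in r.iter_content(chunk_size=256):   # read the body in 256-byte chunks
        f.write(chunk)
r.close()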

POST a JSON request:

import requests
import json

r = requests.post('https://api.github.com/some/endpoint', data=json.dumps({'some': 'data'}))
print(r.json())
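
requests can also serialize the body itself via the json parameter, which in addition sets the Content-Type header to application/json (the endpoint below is the same placeholder as above):

import requests

r = requests.post('https://api.github.com/some/endpoint', json={'some': 'data'})
print(r.status_code)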

5) Custom headers and cookie information

header = {'user-agent': 'my-app/0.0.1'}
cookie = {'key': 'value'}
r = requests.get('your url', headers=header, cookies=cookie)   # the same keyword arguments work for post() and the other methods

data = {'some': 'data'}
headers = {'content-type': 'application/json',
           'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:22.0) Gecko/20100101 Firefox/22.0'}

r = requests.post('https://api.github.com/some/endpoint', data=data, headers=headers)
print(r.text)

6) Response status code

After calling a requests method, a Response object is returned; it stores the content of the server's response, such as the r.text and r.status_code already shown in the examples above.
Getting the response body as text: when you access r.text, the body is decoded with the encoding requests has detected for the response; you can assign a different value to r.encoding to make r.text decode with a custom encoding.

r = requests.get('http://www.itwhy.org')
print(r.text, '\n{}\n'.format('*'*79), r.encoding)
r.encoding = 'GBK'
print(r.text, '\n{}\n'.format('*'*79), r.encoding)

Sample code:

import requests

r = requests.get('https://github.com/Ranxf')        # the most basic GET request, without parameters
print(r.status_code)                                 # print the returned status code
r1 = requests.get(url='http://dict.baidu.com/s', params={'wd': 'python'})       # GET request with parameters
print(r1.url)
print(r1.text)                                       # print the decoded response body

Output:

/usr/bin/python3.5 /home/rxf/python3_1000/1000/python3_server/python3_requests/demo1.py
200
http://dict.baidu.com/s?wd=python
…………

Process finished with exit code 0
r.status_code                      # if it is not 200, r.raise_for_status() can be used to raise an exception

7) Response

r.headers                          # the response headers as a dictionary
r.request.headers                  # the headers that were sent to the server
r.cookies                          # the response cookies
r.history                          # the redirection history; pass allow_redirects=False in the request to prevent redirects from being followed
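
A short sketch of inspecting redirects, assuming httpbin.org as a test endpoint that issues one redirect:

import requests

r = requests.get('https://httpbin.org/redirect/1')
print(r.history)       # [<Response [302]>] -- the intermediate response(s)
print(r.url)           # the final URL after following the redirect

r = requests.get('https://httpbin.org/redirect/1', allow_redirects=False)
print(r.status_code)   # 302 -- the redirect was not followed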

8) Timeout

 

r = requests.get('url', timeout=1)            # timeout in seconds; a single value is applied to both the connect and the read timeout
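
A hedged sketch that passes a separate (connect, read) timeout tuple and handles the resulting exception (httpbin.org/delay is an assumed endpoint that responds slowly):

import requests

try:
    # (connect timeout, read timeout) in seconds
    r = requests.get('https://httpbin.org/delay/3', timeout=(3.05, 1))
    print(r.status_code)
except requests.exceptions.Timeout as e:
    print('request timed out:', e)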

 

9) Session objects, which persist certain parameters across requests

s = requests.Session()
s.auth = ('user', 'passwd')
s.headers.update({'key': 'value'})   # these headers are sent with every request made through s
r = s.get('url')
r1 = s.get('url1')
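
Cookies set by the server are kept on the session and sent back automatically on later requests; a minimal sketch using httpbin.org as an assumed endpoint:

import requests

s = requests.Session()
s.get('https://httpbin.org/cookies/set/sessioncookie/123456')   # the server sets a cookie
r = s.get('https://httpbin.org/cookies')                        # the session sends it back automatically
print(r.json())   # {'cookies': {'sessioncookie': '123456'}}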

10) Proxy

 

proxies = {'http': 'ip1', 'https': 'ip2'}
requests.get('url', proxies=proxies)

 

Summary:

# HTTP request types
# GET
r = requests.get('https://github.com/timeline.json')
# POST
r = requests.post("http://m.ctrip.com/post")
# PUT
r = requests.put("http://m.ctrip.com/put")
# DELETE
r = requests.delete("http://m.ctrip.com/delete")
# HEAD
r = requests.head("http://m.ctrip.com/head")
# OPTIONS
r = requests.options("http://m.ctrip.com/get")

# Get the response content
print(r.content)   # the body as bytes
print(r.text)      # the body as decoded text

# Pass URL parameters
payload = {'keyword': 'Hong Kong', 'salecityid': '2'}
r = requests.get("http://m.ctrip.com/webapp/tourvisa/visa_list", params=payload)
print(r.url)       # e.g. http://m.ctrip.com/webapp/tourvisa/visa_list?salecityid=2&keyword=Hong Kong

# Get/modify the response encoding
r = requests.get('https://github.com/timeline.json')
print(r.encoding)


# JSON handling
r = requests.get('https://github.com/timeline.json')
print(r.json())    # r.json() uses the JSON decoder built into requests; no separate json import is needed

# Custom request headers
url = 'http://m.ctrip.com'
headers = {'User-Agent': 'Mozilla/5.0 (Linux; Android 4.2.1; en-us; Nexus 4 Build/JOP40D) AppleWebKit/535.19 (KHTML, like Gecko) Chrome/18.0.1025.166 Mobile Safari/535.19'}
r = requests.post(url, headers=headers)
print(r.request.headers)

# Complex POST request
import json
url = 'http://m.ctrip.com'
payload = {'some': 'data'}
r = requests.post(url, data=json.dumps(payload))   # to send the body as a JSON string rather than form data, serialize the dict with json.dumps first

# POST a multipart-encoded file
url = 'http://m.ctrip.com'
files = {'file': open('report.xls', 'rb')}
r = requests.post(url, files=files)

# Response status code
r = requests.get('http://m.ctrip.com')
print(r.status_code)

# Response headers
r = requests.get('http://m.ctrip.com')
print(r.headers)
print(r.headers['Content-Type'])
print(r.headers.get('content-type'))   # two ways to access a response header (keys are case-insensitive)

# Cookies
url = 'http://example.com/some/cookie/setting/url'
r = requests.get(url)
r.cookies['example_cookie_name']       # read a cookie from the response

url = 'http://m.ctrip.com/cookies'
cookies = dict(cookies_are='working')
r = requests.get(url, cookies=cookies) # send cookies with the request

# Set a timeout
r = requests.get('http://m.ctrip.com', timeout=0.001)

# Set an access proxy
proxies = {
    "http": "http://10.10.1.10:3128",
    "https": "http://10.10.1.100:4444",
}
r = requests.get('http://m.ctrip.com', proxies=proxies)


# If the proxy requires a username and password:
proxies = {
    "http": "http://user:pass@10.10.1.10:3128/",
}

3. Sample code

GET request

# 1. Example without parameters

import requests

ret = requests.get('https://github.com/timeline.json')

print(ret.url)
print(ret.text)


# 2. Example with parameters

import requests

payload = {'key1': 'value1', 'key2': 'value2'}
ret = requests.get("http://httpbin.org/get", params=payload)

print(ret.url)
print(ret.text)

POST request

# 1. Basic POST example

import requests

payload = {'key1': 'value1', 'key2': 'value2'}
ret = requests.post("http://httpbin.org/post", data=payload)

print(ret.text)


# 2. Example sending request headers and JSON data

import requests
import json

url = 'https://api.github.com/some/endpoint'
payload = {'some': 'data'}
headers = {'content-type': 'application/json'}

ret = requests.post(url, data=json.dumps(payload), headers=headers)

print(ret.text)
print(ret.cookies)

request() parameters

def request(method, url, **kwargs):
    """Constructs and sends a :class:`Request <Request>`.

    :param method: method for the new :class:`Request` object.
    :param url: URL for the new :class:`Request` object.
    :param params: (optional) Dictionary or bytes to be sent in the query string for the :class:`Request`.
    :param data: (optional) Dictionary, bytes, or file-like object to send in the body of the :class:`Request`.
    :param json: (optional) json data to send in the body of the :class:`Request`.
    :param headers: (optional) Dictionary of HTTP Headers to send with the :class:`Request`.
    :param cookies: (optional) Dict or CookieJar object to send with the :class:`Request`.
    :param files: (optional) Dictionary of ``'name': file-like-objects`` (or ``{'name': file-tuple}``) for multipart encoding upload.
        ``file-tuple`` can be a 2-tuple ``('filename', fileobj)``, 3-tuple ``('filename', fileobj, 'content_type')``
        or a 4-tuple ``('filename', fileobj, 'content_type', custom_headers)``, where ``'content-type'`` is a string
        defining the content type of the given file and ``custom_headers`` a dict-like object containing additional headers
        to add for the file.
    :param auth: (optional) Auth tuple to enable Basic/Digest/Custom HTTP Auth.
    :param timeout: (optional) How long to wait for the server to send data
        before giving up, as a float, or a :ref:`(connect timeout, read
        timeout) <timeouts>` tuple.
    :type timeout: float or tuple
    :param allow_redirects: (optional) Boolean. Set to True if POST/PUT/DELETE redirect following is allowed.
    :type allow_redirects: bool
    :param proxies: (optional) Dictionary mapping protocol to the URL of the proxy.
    :param verify: (optional) whether the SSL cert will be verified. A CA_BUNDLE path can also be provided. Defaults to ``True``.
    :param stream: (optional) if ``False``, the response content will be immediately downloaded.
    :param cert: (optional) if String, path to ssl client cert file (.pem). If Tuple, ('cert', 'key') pair.
    :return: :class:`Response <Response>` object
    :rtype: requests.Response

    Usage::

      >>> import requests
      >>> req = requests.request('GET', 'http://httpbin.org/get')
      <Response [200]>
    """

Parameter usage examples:

def param_method_url():
    # requests.request(method='get', url='http://127.0.0.1:8000/test/')
    # requests.request(method='post', url='http://127.0.0.1:8000/test/')
    pass


def param_param():
     # - can be a dictionary 
    # - can be a string 
    # - can be a byte (within ascii encoding)

    # requests.request(method='get', 
    # url='http://127.0.0.1:8000/test/', 
    # params={'k1': 'v1', 'k2': 'Utilities'} )

    # requests.request(method='get', 
    # url='http://127.0.0.1:8000/test/', 
    # params="k1=v1&k2=Utilities&k3=v3&k3=vv3")

    # requests.request(method='get',
    # url='http://127.0.0.1:8000/test/',
    # params=bytes("k1=v1&k2=k2&k3=v3&k3=vv3", encoding='utf8'))

    # ERROR 
    # requests.request(method='get', 
    # url='http://127.0.0.1:8000/test/', 
    # params=bytes("k1=v1&k2=Utility&k3=v3&k3=vv3", encoding='utf8')) 
    pass


def param_data():
    # can be a dictionary
    # can be a string
    # can be bytes
    # can be a file object

    # requests.request(method='POST', 
    # url='http://127.0.0.1:8000/test/', 
    # data={'k1': 'v1', 'k2': 'Utilities'} )

    # requests.request(method='POST',
    # url='http://127.0.0.1:8000/test/',
    # data="k1=v1; k2=v2; k3=v3; k3=v4"
    # )

    # requests.request(method='POST',
    # url='http://127.0.0.1:8000/test/',
    # data="k1=v1;k2=v2;k3=v3;k3=v4",
    # headers={'Content-Type': 'application/x-www-form-urlencoded'}
    # )

    # requests.request(method='POST',
    # url='http://127.0.0.1:8000/test/',
    # data=open('data_file.py', mode='r', encoding='utf-8'),   # the file content is: k1=v1;k2=v2;k3=v3;k3=v4
    # headers={'Content-Type': 'application/x-www-form-urlencoded'}
    # )
    pass


def param_json():
    # json= serializes the given data into a string with json.dumps(...) and sends it as the request body,
    # setting the Content-Type header to 'application/json'
    requests.request(method='POST',
                     url='http://127.0.0.1:8000/test/',
                     json={'k1': 'v1', 'k2': 'Utilities'})


def param_headers():
    # send request headers to the server
    requests.request(method='POST',
                     url='http://127.0.0.1:8000/test/',
                     json={'k1': 'v1', 'k2': 'Utilities'},
                     headers={'Content-Type': 'application/x-www-form-urlencoded'}
                     )


def param_cookies():
    # send cookies to the server
    requests.request(method='POST',
                     url='http://127.0.0.1:8000/test/',
                     data={'k1': 'v1', 'k2': 'v2'},
                     cookies={'cook1': 'value1'},
                     )
    # A CookieJar can also be used (the dict form is a wrapper built on top of it)
    from http.cookiejar import CookieJar
    from http.cookiejar import Cookie

    obj = CookieJar()
    obj.set_cookie(Cookie(version=0, name='c1', value='v1', port=None, domain='', path='/', secure=False, expires=None,
                          discard=True, comment=None, comment_url=None, rest={'HttpOnly': None}, rfc2109=False,
                          port_specified=False, domain_specified=False, domain_initial_dot=False, path_specified=False)
                   )
    requests.request(method='POST',
                     url='http://127.0.0.1:8000/test/',
                     data={'k1': 'v1', 'k2': 'v2'},
                     cookies=obj)


def param_files():
    # send a file
    # file_dict = {
    #     'f1': open('readme', 'rb')
    # }
    # requests.request(method='POST',
    #                  url='http://127.0.0.1:8000/test/',
    #                  files=file_dict)

    # Send a file with a custom file name
    # file_dict = {
    #     'f1': ('test.txt', open('readme', 'rb'))
    # }
    # requests.request(method='POST',
    #                  url='http://127.0.0.1:8000/test/',
    #                  files=file_dict)

    # Send a file with a custom file name, passing the content as a string
    # file_dict = {
    #     'f1': ('test.txt', "hahsfaksfa9kasdjflaksdjf")
    # }
    # requests.request(method='POST',
    #                  url='http://127.0.0.1:8000/test/',
    #                  files=file_dict)

    # Send a file with a custom file name, content type and extra headers
    # file_dict = {
    #     'f1': ('test.txt', "hahsfaksfa9kasdjflaksdjf", 'application/text', {'k1': '0'})
    # }
    # requests.request(method='POST',
    #                  url='http://127.0.0.1:8000/test/',
    #                  files=file_dict)

    pass


def param_auth():
    from requests.auth import HTTPBasicAuth, HTTPDigestAuth

    ret = requests.get('https://api.github.com/user', auth=HTTPBasicAuth('wupeiqi', 'sdfasdfasdf'))
    print(ret.text)

    # ret = requests.get('http://192.168.1.1',
    # auth=HTTPBasicAuth('admin', 'admin'))
    # ret.encoding = 'gbk'
    # print(ret.text)

    # ret = requests.get('http://httpbin.org/digest-auth/auth/user/pass', auth=HTTPDigestAuth('user', 'pass'))
    # print(ret)
    #


def param_timeout():
    # ret = requests.get('http://google.com/', timeout=1)
    # print(ret)

    # ret = requests.get('http://google.com/', timeout=(5, 1))
    # print(ret)
    pass


def param_allow_redirects():
    ret = requests.get('http://127.0.0.1:8000/test/', allow_redirects=False)
    print(ret.text)


def param_proxies():
    # proxies = {
    # "http": "61.172.249.96:80",
    # "https": "http://61.185.219.126:3128",
    # }

    # proxies = {'http://10.20.1.128': 'http://10.10.1.10:5323'}

    # ret = requests.get("http://www.proxy360.cn/Proxy", proxies=proxies)
    # print(ret.headers)


    # from requests.auth import HTTPProxyAuth
    #
    # proxyDict = {
    # 'http': '77.75.105.165',
    # 'https': '77.75.105.165'
    # }
    # auth = HTTPProxyAuth('username', 'mypassword')
    #
    # r = requests.get("http://www.google.com", proxies=proxyDict, auth=auth)
    # print(r.text)

    pass


def param_stream():
    ret = requests.get('http://127.0.0.1:8000/test/', stream=True)
    print(ret.content)
    ret.close()

    # from contextlib import closing
    # with closing(requests.get('http://httpbin.org/get', stream=True)) as r:
    #     # handle the response here
    #     for i in r.iter_content():
    #         print(i)


def requests_session():
    import requests

    session = requests.Session()

    # ## 1. First log in to any page, get the cookie 

    i1 = session.get(url="http://dig.chouti.com/help/service")

    # ## 2. The user logs in, carries the last cookie, and the background authorizes the gpsd in the cookie 
    i2 = session.post(
        url="http://dig.chouti.com/login",
        data={
            'phone': "8615131255089",
            'password': "xxxxxx",
            'oneMonth': ""
        }
    )

    i3 = session.post(
        url="http://dig.chouti.com/link/vote?linksId=8589623",
    )
    print(i3.text)

JSON request:

#! /usr/bin/python3
import requests
import json


class url_request():
    def __init__(self):
        ''' init '''

if __name__ == '__main__':
    headers = {'Content-Type': 'application/json'}
    payload = {'CountryName': 'China',
               'ProvinceName': 'Sichuan Province',
               'L1CityName': 'chengdu',
               'L2CityName': 'yibing',
               'TownName': '',
               'Longitude': '107.33393',
               'Latitude': '33.157131',
               'Language': 'CN'}
    r = requests.post("http://www.xxxxxx.com/CityLocation/json/LBSLocateCity",
                      headers=headers, data=json.dumps(payload))
    data = r.json()
    if r.status_code != 200:
        print('LBSLocateCity API Error ' + str(r.status_code))
    print(data['CityEntities'][0]['CityID'])   # print the value of one key in the returned JSON
    print(data['ResponseStatus']['Ack'])
    print(json.dumps(data, indent=4, sort_keys=True, ensure_ascii=False))   # pretty-print the JSON; ensure_ascii must be False, otherwise Chinese is shown as unicode escapes

XML request:

#! /usr/bin/python3
import requests

class url_request():
    def __init__(self):
        """init"""

if __name__ == '__main__':
    headers = {'Content-type': 'text/xml'}
    XML = '<?xml version="1.0" encoding="utf-8"?><soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"><soap:Body><Request xmlns="http://tempuri.org/"><jme><JobClassFullName>WeChatJSTicket.JobWS.Job.JobRefreshTicket,WeChatJSTicket.JobWS</JobClassFullName><Action>RUN</Action><Param>1</Param><HostIP>127.0.0.1</HostIP><JobInfo>1</JobInfo><NeedParallel>false</NeedParallel></jme></Request></soap:Body></soap:Envelope>'
    url = 'http://jobws.push.mobile.xxxxxxxx.com/RefreshWeiXInTokenJob/RefreshService.asmx'
    r = requests.post(url=url, headers=headers, data=XML)
    data = r.text
    print(data)

Status exception handling

import requests

URL = 'http://ip.taobao.com/service/getIpInfo.php'   # Taobao IP address library API
try:
    r = requests.get(URL, params={'ip': '8.8.8.8'}, timeout=1)
    r.raise_for_status()   # raise an exception for failed (4xx/5xx) responses
except requests.RequestException as e:
    print(e)
else:
    result = r.json()
    print(type(result), result, sep='\n')

Upload files

The requests module can also upload files; the content type of the file is handled automatically:

import requests

url = 'http://127.0.0.1:8080/upload'
files = {'file': open('/home/rxf/test.jpg', 'rb')}
# files = {'file': ('report.jpg', open('/home/lyb/sjzl.mpg', 'rb'))}   # set the file name explicitly

r = requests.post(url, files=files)
print(r.text)

requests is even more convenient than that: a string can be uploaded as a file:

import requests

url = 'http://127.0.0.1:8080/upload'
files = {'file': ('test.txt', b'Hello Requests.')}      # the file name must be set explicitly

r = requests.post(url, files=files)
print(r.text)

Authentication

Basic Authentication (HTTP Basic Auth)

import requests
from requests.auth import HTTPBasicAuth
 
r = requests.get('https://httpbin.org/hidden-basic-auth/user/passwd', auth=HTTPBasicAuth('user', 'passwd'))
# r = requests.get('https://httpbin.org/hidden-basic-auth/user/passwd', auth=('user', 'passwd'))    # shorthand
print(r.json())

Another very popular form of HTTP authentication is Digest Authentication, which Requests supports out of the box as well:

requests.get(URL, auth=HTTPDigestAuth('user', 'pass'))

Cookies and Session Objects

If a response contains some cookies, you can quickly access them:

import requests
 
r = requests.get('http://www.google.com.hk/')
print(r.cookies['NID'])
print(tuple(r.cookies))

To send your cookies to the server, use the cookies parameter:

import requests

url = 'http://httpbin.org/cookies'
cookies = {'testCookies_1': 'Hello_Python3', 'testCookies_2': 'Hello_Requests'}
# Note: spaces, square brackets, parentheses, equals signs, commas, double quotes, slashes, question marks, @, colons, semicolons and other special symbols cannot be used in cookie values.
r = requests.get(url, cookies=cookies)
print(r.json())

Session objects let you persist certain parameters across requests. Most conveniently, cookies are shared across all requests made from the same Session instance and are handled automatically.
Here is a real example: the following is a sign-in script for the Kuaipan cloud disk:

import requests
 
headers = {'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
           'Accept-Encoding': 'gzip, deflate, compress',
           'Accept-Language': 'en-us;q=0.5,en;q=0.3',
           'Cache-Control': 'max-age=0',
           'Connection': 'keep-alive',
           'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:22.0) Gecko/20100101 Firefox/22.0'}
 
s = requests.Session()
s.headers.update(headers)
# s.auth = ('superuser', '123')
s.get('https://www.kuaipan.cn/account_login.htm')
 
_URL = 'http://www.kuaipan.cn/index.php'
s.post(_URL, params={'ac':'account', 'op':'login'},
       data={'username':'****@foxmail.com', 'userpwd':'********', 'isajax':'yes'})
r = s.get(_URL, params={'ac':'zone', 'op':'taskdetail'})
print(r.json())
s.get(_URL, params={'ac':'common', 'op':'usersign'})

Example: using the requests module to grab a web page's source code and save it to a file

This is a basic file save operation, but there are a few issues worth noting here:

1. Install the requests package by running pip install requests on the command line. Many people recommend requests, although the built-in urllib.request can also fetch a page's source.

2. Set the encoding parameter of the open() call to utf-8, otherwise the saved file will be garbled.

3. Printing the fetched content directly in cmd raises various encoding errors, so save it to a file to view it.

4. The with open(...) form is the better way to write this, since it releases the resource automatically when the block ends.

#! /usr/bin/python3
import requests

''' Grab the source of a web page with requests and save it to a file '''
html = requests.get("http://www.baidu.com")
with open('test.txt', 'w', encoding='utf-8') as f:
    f.write(html.text)

''' Read a txt file one line at a time and save each line to another txt file '''
ff = open('testt.txt', 'w', encoding='utf-8')
with open('test.txt', encoding='utf-8') as f:
    for line in f:
        ff.write(line)
ff.close()   # close after the loop, once every line has been written

Because printing the data line by line on the command line triggers encoding errors for Chinese text, the file is instead read line by line and saved to another file to verify that reading works correctly. (Note that the encoding is specified when opening the files.)

 

"Auto login" example:

#!/usr/bin/env python
# -*- coding:utf-8 -*-
import requests


# ############## Method One ############### 
"""
# ## 1. First log in to any page and get the cookie
i1 = requests.get(url="http://dig.chouti.com/help/service")
i1_cookies = i1.cookies.get_dict()

# ## 2. The user logs in, carries the last cookie, and the background authorizes the gpsd in the cookie
i2 = requests.post(
    url="http://dig.chouti.com/login",
    data={
        'phone': "8615131255089",
        'password': "xxooxxoo",
        'oneMonth': ""
    },
    cookies=i1_cookies
)

# ## 3. Like (just bring the authorized gpsd)
gpsd = i1_cookies['gpsd']
i3 = requests.post(
    url="http://dig.chouti.com/link/vote?linksId=8589523",
    cookies={'gpsd': gpsd}
)

print(i3.text)
"""


# ############## Method two ############### 
"""
import requests

session = requests.Session()
i1 = session.get(url="http://dig.chouti.com/help/service")
i2 = session.post(
    url="http://dig.chouti.com/login",
    data={
        'phone': "8615131255089",
        'password': "xxooxxoo",
        'oneMonth': ""
    }
)
i3 = session.post(
    url="http://dig.chouti.com/link/vote?linksId=8589523"
)
print(i3.text)

"""
(Chouti new hot list example)
#!/usr/bin/env python
# -*- coding:utf-8 -*-

import requests
from bs4 import BeautifulSoup

# ############## Method One ###############
#
# # 1. Visit the login page and get the authenticity_token
# i1 = requests.get('https://github.com/login')
# soup1 = BeautifulSoup(i1.text, features='lxml')
# tag = soup1.find(name='input', attrs={'name': 'authenticity_token'})
# authenticity_token = tag.get('value')
# c1 = i1.cookies.get_dict()
# i1.close()
#
# # 2. Carry the authenticity_token, username, password and other information, and send the user authentication request
# form_data = { 
# "authenticity_token": authenticity_token, 
#      "utf8": "", 
#      "commit": "Sign in", 
#      "login": "[email protected]", 
#      'password': 'xxoo' 
# }
#
# i2 = requests.post('https://github.com/session', data=form_data, cookies=c1)
# c2 = i2.cookies.get_dict()
# c1.update(c2)
# i3 = requests.get('https://github.com/settings/repositories', cookies=c1)
#
# soup3 = BeautifulSoup(i3.text, features='lxml')
# list_group = soup3.find(name='div', class_='listgroup')
#
# from bs4.element import Tag
#
# for child in list_group.children:
#     if isinstance(child, Tag):
#         project_tag = child.find(name='a', class_='mr-1')
#         size_tag = child.find(name='small')
#         temp = "项目:%s(%s); 项目路径:%s" % (project_tag.get('href'), size_tag.string, project_tag.string, )
#         print(temp)



# ############## Method 2 ############### 
# session = requests.Session() 
# # 1. Visit the login page and get the authenticity_token
# i1 = session.get('https://github.com/login') 
# soup1 = BeautifulSoup(i1.text, features='lxml') 
# tag = soup1.find(name='input', attrs={'name': 'authenticity_token'})
# authenticity_token = tag.get('value') 
# c1 = i1.cookies.get_dict() 
# i1.close()
#
# # 2. Carry the authenticity_token, username, password and other information, and send the user authentication request
# form_data = { 
#      "authenticity_token": authenticity_token, 
#      "utf8": "", 
#      "commit": "Sign in", 
#      "login": "[email protected]", 
#      'password': 'xxoo' 
# }
#
# i2 = session.post('https://github.com/session', data=form_data)
# c2 = i2.cookies.get_dict()
# c1.update(c2)
# i3 = session.get('https://github.com/settings/repositories')
#
# soup3 = BeautifulSoup(i3.text, features='lxml')
# list_group = soup3.find(name='div', class_='listgroup')
#
# from bs4.element import Tag
#
# for child in list_group.children:
#     if isinstance(child, Tag):
#         project_tag = child.find(name='a', class_='mr-1')
#         size_tag = child.find(name='small')
#         temp = "项目:%s(%s); 项目路径:%s" % (project_tag.get('href'), size_tag.string, project_tag.string, )
#         print(temp)
(GitHub example)
#!/usr/bin/env python
# -*- coding:utf-8 -*-
import time

import requests
from bs4 import BeautifulSoup

session = requests.Session()

i1 = session.get(
    url='https://www.zhihu.com/#signin',
    headers={
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.98 Safari/537.36',
    }
)

soup1 = BeautifulSoup(i1.text, 'lxml')
xsrf_tag = soup1.find(name='input', attrs={'name': '_xsrf'})
xsrf = xsrf_tag.get('value')

current_time = time.time()
i2 = session.get(
    url='https://www.zhihu.com/captcha.gif',
    params={'r': current_time, 'type': 'login'},
    headers={
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.98 Safari/537.36',
    })

with open('zhihu.gif', 'wb') as f:
    f.write(i2.content)

captcha = input('Please open the zhihu.gif file, view it and enter the verification code: ')
form_data = {
    "_xsrf": xsrf,
    'password': 'xxooxxoo',
    "captcha": 'captcha',
    'email': '[email protected]'
}
i3 = session.post(
    url='https://www.zhihu.com/login/email',
    data=form_data,
    headers={
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.98 Safari/537.36',
    }
)

i4 = session.get(
    url='https://www.zhihu.com/settings/profile',
    headers={
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.98 Safari/537.36',
    }
)

soup4 = BeautifulSoup(i4.text, 'lxml')
tag = soup4.find(id='rename-section')
nick_name = tag.find('span',class_='name').string
print(nick_name)
(Zhihu example)
#!/usr/bin/env python
# -*- coding:utf-8 -*-
import re
import json
import base64

import rsa
import requests


def js_encrypt(text):
    b64der = 'MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQCp0wHYbg/NOPO3nzMD3dndwS0MccuMeXCHgVlGOoYyFwLdS24Im2e7YyhB0wrUsyYf0/nhzCzBK8ZC9eCWqd0aHbdgOQT6CuFQBMjbyGYvlVYU2ZP7kG9Ft6YV6oc9ambuO7nPZh+bvXH0zDKfi02prknrScAKC0XhadTHT3Al0QIDAQAB'
    der = base64.standard_b64decode(b64der)

    pk = rsa.PublicKey.load_pkcs1_openssl_der(der)
    v1 = rsa.encrypt(bytes(text, 'utf8'), pk)
    value = base64.encodebytes(v1).replace(b'\n', b'')
    value = value.decode('utf8')

    return value


session = requests.Session()

i1 = session.get('https://passport.cnblogs.com/user/signin')
rep = re.compile("'VerificationToken': '(.*)'")
v = re.search(rep, i1.text)
verification_token = v.group(1)

form_data = {
    'input1': js_encrypt('wptawy'),
    'input2': js_encrypt('asdfasdf'),
    'remember': False
}

i2 = session.post(url='https://passport.cnblogs.com/user/signin',
                  data=json.dumps(form_data),
                  headers={
                      'Content-Type': 'application/json; charset=UTF-8',
                      'X-Requested-With': 'XMLHttpRequest',
                      'VerificationToken': verification_token}
                  )

i3 = session.get(url='https://i.cnblogs.com/EditDiary.aspx')

print(i3.text)
(cnblogs / Blog Park example)
#!/usr/bin/env python
# -*- coding:utf-8 -*-

import re
import requests


# Step 1: Visit the login page to get X_Anti_Forge_Token and X_Anti_Forge_Code
# 1. Request url: https://passport.lagou.com/login/login.html
# 2. Request method: GET
# 3. Request headers:
#     User-Agent
r1 = requests.get('https://passport.lagou.com/login/login.html',
                  headers={
                      'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36',
                  },
                  )

X_Anti_Forge_Token = re.findall("X_Anti_Forge_Token = '(.*?)'", r1.text, re.S)[0]
X_Anti_Forge_Code = re.findall("X_Anti_Forge_Code = '(.*?)'", r1.text, re.S)[0]
print(X_Anti_Forge_Token, X_Anti_Forge_Code)
# print(r1.cookies.get_dict())

# Step 2: Log in
# 1. Request url: https://passport.lagou.com/login/login.json
# 2. Request method: POST
# 3. Request headers:
#     cookie
#     User-Agent
#     Referer: https://passport.lagou.com/login/login.html
#     X-Anit-Forge-Code: 53165984
#     X-Anit-Forge-Token: 3b6a2f62-80f0-428b-8efb-ef72fc100d78
#     X-Requested-With: XMLHttpRequest
# 4. Request body:
#     isValidate: true
#     username: 15131252215
#     password: ab18d270d7126ea65915c50288c22c0d
#     request_form_verifyCode: ''
#     submit: ''
r2 = requests.post(
    'https://passport.lagou.com/login/login.json',
    headers={
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36',
        'Referer': 'https://passport.lagou.com/login/login.html',
        'X-Anit-Forge-Code': X_Anti_Forge_Code,
        'X-Anit-Forge-Token': X_Anti_Forge_Token,
        'X-Requested-With': 'XMLHttpRequest'
    },
    data={
        "isValidate": True,
        'username': '15131255089',
        'password': 'ab18d270d7126ea65915c50288c22c0d',
        'request_form_verifyCode': '',
        'submit': ''
    },
    cookies=r1.cookies.get_dict()
)
print(r2.text)
(Lagou example)

 

References:

http://cn.python-requests.org/zh_CN/latest/user/quickstart.html#id4

http://www.python-requests.org/en/master/

http://docs.python-requests.org/en/latest/user/quickstart/

http://www.ifindbug.com/doc/id-48232/name-python-requests-module-study-notes.html#t0

http://www.cnblogs.com/wupeiqi/articles/6283017.html
