Python: urllib2 no longer exists
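1. Differences between Python's httplib, urllib, and urllib2, and how to use them
urllib and urllib2
urllib and urllib2 are both modules for making URL requests, but urllib2 can accept an instance of the Request class to set the headers of a URL request, while urllib can only accept a URL.
This means that with urllib alone you cannot, for example, spoof your User-Agent string.
urllib provides the urlencode method for generating GET query strings, and urllib2 does not. This is why urllib is often used together with urllib2.
In Python 2, most HTTP requests are made through urllib2.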
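httplib
httplib implements the client side of the HTTP and HTTPS protocols. It is generally not used directly; Python's higher-level wrapper modules (urllib, urllib2) use its HTTP implementation.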
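In Python 3 the httplib module was renamed http.client. A minimal sketch of using it directly, assuming a placeholder host (example.org) and a helper name (fetch_status) made up for illustration; nothing is sent over the network unless fetch_status() is called:

```python
import http.client

# Creating the connection object does not open a socket yet;
# the connection is made when the first request is sent.
conn = http.client.HTTPSConnection("example.org", timeout=10)


def fetch_status(connection, path="/"):
    """Send a GET request and return (status code, reason phrase)."""
    connection.request("GET", path, headers={"User-Agent": "demo-client/0.1"})
    response = connection.getresponse()
    response.read()  # drain the body so the connection can be reused
    return response.status, response.reason


# Host and default HTTPS port are known before any network traffic.
print(conn.host, conn.port)
```

For everyday use, urllib.request hides these steps behind urlopen().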
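2. Why does the urllib package in the Python 3.6 I downloaded have no urlopen method?
In Python 3.x the urllib module was reorganized, so calls that used urllib.urlopen should now use urllib.request.urlopen.
For example, write it like this: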
import urllib.request
# In Python 3, urlopen lives in urllib.request
web = urllib.request.urlopen('http://www..com')
f = web.read()  # read() returns the response body as bytes
print(f)
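3. Making https requests with urllib in Python 3
I have just started learning the basics of web crawling with Python. I am using Python 3.6.4 and following the tutorial Python爬蟲入門教程 (an introductory Python crawler tutorial).
Python 3.6 no longer has the urllib2 library, so I do not need to worry about the differences between urllib and urllib2 or when to use each.
See the official document HOWTO Fetch Internet Resources Using The urllib Package. For http(s) requests, GET and POST are the two most commonly used methods, so I wrote the two small demos below. The URLs were picked arbitrarily; adapt them to your own scenario, and see the comments for the basic idea.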
POST request:
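A POST request along these lines might look like the following sketch. The URL, form fields, and the helper name build_post_request are placeholders invented for illustration; the request is only built here, and passing it to urllib.request.urlopen() would send it.

```python
import urllib.parse
import urllib.request


def build_post_request(url, form):
    """Build a POST request whose body is the URL-encoded form."""
    # urlencode returns str; urlopen requires the body as bytes.
    data = urllib.parse.urlencode(form).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Hypothetical endpoint and form values.
post_req = build_post_request(
    "https://example.org/login", {"name": "alice", "lang": "python"}
)
print(post_req.get_method(), post_req.data)
```

Because data is present, urllib.request treats the request as POST without being told to.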
GET request:
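A GET request might be sketched as follows. The URL, query parameters, and helper names (build_get_request, fetch) are placeholders invented for illustration; nothing is sent over the network unless fetch() is called.

```python
import urllib.parse
import urllib.request


def build_get_request(base_url, params):
    """Append the URL-encoded params to base_url as a query string."""
    query = urllib.parse.urlencode(params)
    return urllib.request.Request(
        base_url + "?" + query,
        headers={"User-Agent": "demo-client/0.1"},
    )


def fetch(request, timeout=10):
    """Send the request and return the decoded response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Hypothetical search endpoint and parameters.
get_req = build_get_request(
    "https://example.org/search", {"q": "python urllib", "page": 1}
)
print(get_req.full_url)
```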
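Note:
Create an unverified context with ssl and pass it to urlopen through the context parameter:
urllib.request.urlopen(full_url, context=context)
Since Python 2.7.9, urlopen verifies the server's SSL certificate by default when it opens an https link, so a certificate that cannot be verified produces the following error:
ssl.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:777)
To skip that verification for a single call, create the context like this:
context = ssl._create_unverified_context()
Or turn off certificate verification globally right after importing ssl:
ssl._create_default_https_context = ssl._create_unverified_context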
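Put together, the per-call variant could look like the sketch below. The helper name fetch_unverified is hypothetical, and _create_unverified_context is a private function of the ssl module, so treat this as a testing aid only: skipping verification removes the protection against man-in-the-middle attacks.

```python
import ssl
import urllib.request

# A context that neither checks the certificate chain nor the host name.
context = ssl._create_unverified_context()


def fetch_unverified(url, timeout=10):
    """Open an https URL without verifying the server certificate."""
    with urllib.request.urlopen(url, context=context, timeout=timeout) as resp:
        return resp.read()


# Confirm what the context actually disables.
print(context.verify_mode == ssl.CERT_NONE, context.check_hostname)
```

Passing the context per call keeps verification enabled for every other request in the program, unlike the global override.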
4. Python 3.4 has no urllib2; what should I do?
In Python 3.x the urllib and urllib2 libraries were merged into a single urllib package.
urllib2.urlopen() became urllib.request.urlopen(),
and urllib2.Request() became urllib.request.Request().
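Code that has to run on both versions can fall back from the old import to the new one. A small sketch, in which the URL and the User-Agent value are placeholders:

```python
try:
    # Python 2 location
    from urllib2 import Request, urlopen
except ImportError:
    # Python 3 location
    from urllib.request import Request, urlopen

# The Request API is the same under both imports, including custom headers.
req = Request("http://example.org/", headers={"User-Agent": "demo-client/0.1"})
print(req.get_full_url())
```

The request is only constructed here; urlopen(req) would send it.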
5. Does urllib in Python 3.4 have a urlencode function?
In Python 3.x, urlencode is in the urllib.parse module.
Call it as urllib.parse.urlencode:
urllib.parse.urlencode(query, doseq=False, safe='', encoding=None, errors=None, quote_via=quote_plus)
Convert a mapping object or a sequence of two-element tuples, which may contain str or bytes objects, to a "percent-encoded" string. If the resultant string is to be used as data for a POST operation with the urlopen() function, then it should be properly encoded to bytes, otherwise it would result in a TypeError.
The resulting string is a series of key=value pairs separated by '&' characters, where both key and value are quoted using the quote_via function. By default, quote_plus() is used to quote the values, which means spaces are quoted as a '+' character and '/' characters are encoded as %2F, which follows the standard for GET requests (application/x-www-form-urlencoded). An alternate function that can be passed as quote_via is quote(), which will encode spaces as %20 and not encode '/' characters. For maximum control of what is quoted, use quote and specify a value for safe.
When a sequence of two-element tuples is used as the query argument, the first element of each tuple is a key and the second is a value. The value element in itself can be a sequence and in that case, if the optional parameter doseq evaluates to True, individual key=value pairs separated by '&' are generated for each element of the value sequence for the key. The order of parameters in the encoded string will match the order of parameter tuples in the sequence.
The safe, encoding, and errors parameters are passed down to quote_via (the encoding and errors parameters are only passed when a query element is a str).
To reverse this encoding process, parse_qs() and parse_qsl() are provided in this module to parse query strings into Python data structures.
Refer to urllib examples to find out how the urlencode method can be used for generating a query string for a URL or data for POST.
Changed in version 3.2: Query parameter supports bytes and string objects.
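The parameters described above can be tried out with a short sketch. The keys and values are made-up sample data, and safe='/' is passed explicitly in the quote() case so that the slash is left unencoded:

```python
from urllib.parse import parse_qs, quote, urlencode

# Default quote_via=quote_plus: space becomes '+', '/' becomes %2F.
default_encoded = urlencode({"q": "a b/c"})
print(default_encoded)

# quote() with an explicit safe value: space becomes %20, '/' is kept.
quoted = urlencode({"q": "a b/c"}, quote_via=quote, safe="/")
print(quoted)

# doseq=True expands a sequence value into one key=value pair per element.
multi = urlencode([("tag", ["x", "y"]), ("page", 2)], doseq=True)
print(multi)

# Data for a POST must be bytes, so encode the resulting string.
post_data = urlencode({"name": "alice"}).encode("ascii")

# parse_qs reverses the encoding; every value comes back as a list of str.
parsed = parse_qs(multi)
print(parsed)
```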
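6. Differences between Python's httplib, urllib, and urllib2, and how to use them
Overall, urllib2 is an enhancement of urllib, but urllib has functions that urllib2 lacks. With urllib2 you can pass a Request object to urllib2.urlopen to modify the request headers. If you visit a website and want to change the User Agent (to disguise your browser), you need urllib2. urllib supports setting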