python urllib 302

Published: 2022-07-25 05:08:33

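① How do I handle cookies across a 302 redirect with urllib2 in Python?

A cookie is just a key-value pair carried in the HTTP headers. You can set it by hand before each request and save it after each response.

Alternatively, the standard library can do this for you: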

import cookielib, urllib2
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
r = opener.open("http://example.com/")

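Here cj takes care of both steps: it stores the cookies from each response and attaches them to later requests.

Official documentation: https://docs.python.org/2/library/cookielib.html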
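The manual route mentioned at the start of this answer can be sketched in Python 3, where urllib2 became urllib.request. This is only an illustration under assumptions: the Set-Cookie value and the URL are made up, and no request is actually sent.

```python
from http.cookies import SimpleCookie
import urllib.request

# Hypothetical Set-Cookie header value, as a login response might return it.
set_cookie_value = "sessionid=abc123; Path=/; HttpOnly"

# "Save after the response": parse the header into name/value pairs.
received = SimpleCookie()
received.load(set_cookie_value)
saved = {name: morsel.value for name, morsel in received.items()}

# "Assign before the request": send the pairs back in a Cookie header.
cookie_header = "; ".join("%s=%s" % (k, v) for k, v in saved.items())
req = urllib.request.Request("http://example.com/", headers={"Cookie": cookie_header})

print(saved)
print(req.get_header("Cookie"))
```

A CookieJar does the same bookkeeping automatically and also applies domain, path, and expiry rules, which is why it is usually the better choice.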

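② Where can I download the Python urllib2 module?

urllib2 ships with Python 2 as part of the standard library, so there is nothing to download.

In Python 3.x, urllib2 was replaced by urllib.request (its exception classes moved to urllib.error).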

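③ How do I use urllib2 in Python?

urllib and urllib2
Both urllib and urllib2 are modules for making URL requests, but urllib2 can accept an instance of the Request class, which lets you set the headers of the request, while urllib accepts only a URL. This means that with urllib alone you cannot, for example, spoof your User-Agent string. On the other hand, urllib provides the urlencode method for building GET query strings and urllib2 does not, which is why the two are often used together. Most HTTP requests today are made through urllib2.

httplib
httplib implements the client side of the HTTP and HTTPS protocols. It is rarely used directly; the higher-level modules (urllib, urllib2) build on its HTTP implementation.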
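The division of labour described above (urlencode for the query string, a Request object for the headers) might look like this in Python 3, where both pieces live under the urllib package. The endpoint, the parameters, and the User-Agent string are invented for illustration, and the request is only built, not sent.

```python
import urllib.parse
import urllib.request

# Hypothetical endpoint and parameters, used only for illustration.
base_url = "http://example.com/search"
query = urllib.parse.urlencode({"q": "urllib 302", "page": 1})

req = urllib.request.Request(
    base_url + "?" + query,
    headers={"User-Agent": "Mozilla/5.0 (compatible; demo)"},
)

print(req.full_url)
print(req.get_method())
print(req.get_header("User-agent"))  # Request stores header names capitalized this way
# urllib.request.urlopen(req) would send the request; it is left out to avoid network access.
```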

④ Python 3.4 has no urllib2. What should I do?

In Python 3.x, the urllib and urllib2 libraries were merged into the urllib package.

urllib2.urlopen() became urllib.request.urlopen()

urllib2.Request() became urllib.request.Request()
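If the same script has to run on both Python 2 and Python 3, one common approach is a guarded import. The sketch below assumes only the renames listed above plus urlencode and URLError; the example.com URL is a placeholder and nothing is fetched.

```python
try:
    # Python 3 locations
    from urllib.request import Request, urlopen
    from urllib.parse import urlencode
    from urllib.error import URLError
except ImportError:
    # Python 2 locations
    from urllib2 import Request, urlopen, URLError
    from urllib import urlencode

# The rest of the script can use the same names on either version.
req = Request("http://example.com/?" + urlencode({"a": 1}))
print(req.get_full_url())
```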

⑤ Help with a cookie problem on 302 redirects in Python 3

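With the Python script I wrote yesterday, I registered and activated 50 box.net accounts for uploading files.
Today I kept writing code to log in to box.net automatically and fetch the share links of all files.
During testing I ran into a problem: the account details were correct, but the login always failed.
The referer and user-agent headers were both spoofed, and the cookie was being sent as well.
By setting debuglevel=1 to trace the HTTP requests, I finally found the problem: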

httpHandler = urllib2.HTTPHandler(debuglevel=1)
httpsHandler = urllib2.HTTPSHandler(debuglevel=1)
self.opener = urllib2.build_opener(httpHandler, httpsHandler)

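urllib2 is clever: when it finds a redirect (301, 302) in the HTTP response, it automatically requests the new URL.
But urllib2 has a serious problem here: with an opener like the one above, which has no cookie handler installed, it does not take the cookies along when it requests the new URL.
In other words, we first obtain a cookie through a POST request (it corresponds to the session on the server),
but urllib2 then visits the page that requires authorization without the necessary cookie.
At first I wanted to use httplib directly and only switched everything to urllib2 for consistency, and then urllib2 turned out to have this problem.
To solve it, you can:
1. Switch to httplib, which does not handle redirects automatically the way urllib2 does, so the cookie is not lost
2. Intercept the redirect and stop urllib2 from handling it automatically
I chose to override the http_error_302 method of urllib2.HTTPRedirectHandler to intercept the 302 so that urllib2 no longer handles it: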

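class HttpRedirect_Handler(urllib2.HTTPRedirectHandler):
    def http_error_302(self, req, fp, code, msg, headers):
        # Returning None means the redirect is not followed; opener.open()
        # then raises urllib2.HTTPError with code 302, which the caller must catch.
        pass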

Then pass an instance of HttpRedirect_Handler as an argument to urllib2.build_opener, for example:

self.opener = urllib2.build_opener(HttpRedirect_Handler(),
                                   urllib2.HTTPCookieProcessor(self.cookie))

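This way, when we POST the login with this opener, a 302 is no longer followed automatically,
and the cookie obtained from the successful login is not lost.
After that, requesting the pages that require authorization with self.cookie returns the correct content.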
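The behaviour described above can be checked locally. The sketch below is a test harness under stated assumptions, not the author's script: it uses Python 3's urllib.request (the successor of urllib2) and a throwaway server on 127.0.0.1. LoginHandler, its paths, and the cookie value are all invented for illustration. It compares an opener without cookie support against one that has an HTTPCookieProcessor and keeps the default redirect handling.

```python
import http.cookiejar
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class LoginHandler(BaseHTTPRequestHandler):
    """Invented test server: POST sets a cookie and answers 302 to /home."""

    def do_POST(self):
        # Consume the form body before replying.
        self.rfile.read(int(self.headers.get("Content-Length", 0)))
        self.send_response(302)
        self.send_header("Set-Cookie", "session=abc123; Path=/")
        self.send_header("Location", "/home")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def do_GET(self):
        # Echo the Cookie header so the client can see what it sent.
        body = self.headers.get("Cookie", "no cookie").encode("ascii")
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging


server = HTTPServer(("127.0.0.1", 0), LoginHandler)  # port 0: pick any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
login_url = "http://127.0.0.1:%d/login" % server.server_port
form = b"user=demo"

# An empty ProxyHandler keeps proxy environment variables out of the test.
no_proxy = urllib.request.ProxyHandler({})

# Opener without cookie support: the redirect is followed, the cookie is lost.
plain = urllib.request.build_opener(no_proxy)
without_jar = plain.open(login_url, data=form).read().decode("ascii")

# Opener with a cookie jar: the cookie set by the 302 response is sent to /home.
jar = http.cookiejar.CookieJar()
with_cookies = urllib.request.build_opener(
    no_proxy, urllib.request.HTTPCookieProcessor(jar))
with_jar = with_cookies.open(login_url, data=form).read().decode("ascii")

server.shutdown()
server.server_close()
print(without_jar)
print(with_jar)
```

On a stock Python 3 the first opener should reach /home without a cookie, while the second should arrive with session=abc123. If so, installing a cookie processor is enough for this case and the redirect does not need to be blocked.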

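⑥ How to use Python's urllib2

By default, urllib2 uses the http_proxy environment variable to set the HTTP proxy. If you want to control the proxy explicitly in your program, independent of the environment, you can do it like this: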
import urllib2
enable_proxy = True
proxy_handler = urllib2.ProxyHandler({"http" : 'IP:8080'})
null_proxy_handler = urllib2.ProxyHandler({})
if enable_proxy:
    opener = urllib2.build_opener(proxy_handler)
else:
    opener = urllib2.build_opener(null_proxy_handler)
urllib2.install_opener(opener)
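One detail to watch: urllib2.install_opener() sets urllib2's global opener. That is convenient afterwards, but it rules out finer-grained control, such as using two different proxy settings in the same program. A better practice is to leave the global setting alone and call the opener's open method directly instead of the global urlopen.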
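The recommendation above, one opener per proxy setting and no install_opener, could be sketched as follows in Python 3 (urllib.request in place of urllib2). The helper names make_opener and uses_proxy and the proxy address are hypothetical, and nothing is contacted because open() is never called.

```python
import urllib.request


def make_opener(proxy_url=None):
    """Return an opener that uses proxy_url for HTTP, or no proxy at all."""
    proxies = {"http": proxy_url} if proxy_url else {}
    # An explicit ProxyHandler replaces the default one that reads http_proxy.
    return urllib.request.build_opener(urllib.request.ProxyHandler(proxies))


def uses_proxy(opener):
    """Report whether a ProxyHandler is active in this opener."""
    return any(isinstance(h, urllib.request.ProxyHandler) for h in opener.handlers)


# Hypothetical proxy address; it is only stored, not contacted.
via_proxy = make_opener("http://127.0.0.1:8080")
direct = make_opener()

print(uses_proxy(via_proxy), uses_proxy(direct))
# Each opener is used on its own, e.g. via_proxy.open(url) or direct.open(url);
# urllib.request.install_opener() is never called, so urlopen() stays untouched.
```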

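⑦ Why does this urllib2 call in Python raise an error?

On the surface, you sent this address to Google's server, but something is wrong with the address, so the request fails with:
HTTP Error 302: The HTTP server returned a redirect error that would lead to an infinite loop.
This message is produced by urllib2 itself when it detects that the redirects keep repeating.

The address looks like it was copied from a browser or something similar.
In practice, though, if a program is to simulate this process, you need to work out, for each of the parameters in the address, namely: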
num=100
hl=zh-CN
newwindow=1
safe=strict
q=inurl%3Aadmin_login.aspx
oq=inurl%3Aadmin_login.aspx
gs_l=serp.3...125521.131943.0.132041.38.31.1.0.0.3.209.2367.23j3j1.27.0...0.0...1c.1.bvH-WnKtKjg
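how its value is obtained,
then have the program simulate the process and generate the corresponding parameters,
and only then send the address to Google's server to get the result you want.

In short: first understand how the process works internally, then simulate it in a program.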
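Once the parameters are understood, building the address from them rather than pasting it from a browser could look like the sketch below (Python 3). The host example.com and the search term are placeholders, and only the simple parameters from the list above are reused. The second half goes the other way and splits a copied URL back into its parameters for inspection.

```python
import urllib.parse

# Parameters as a dict; "q" is a made-up search term.
params = {"num": 100, "hl": "zh-CN", "newwindow": 1, "safe": "strict", "q": "python urllib2"}
query = urllib.parse.urlencode(params)
url = "http://example.com/search?" + query  # example.com stands in for the real host

# Going the other way: split a copied URL back into its parameters.
parsed = urllib.parse.urlsplit(url)
recovered = dict(urllib.parse.parse_qsl(parsed.query))

print(query)
print(recovered)
```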

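⑧ Differences between Python's httplib, urllib, and urllib2, and how to use them

How they differ
urllib and urllib2
Both urllib and urllib2 are modules for making URL requests, but urllib2 can accept an instance of the Request class, which lets you set the headers of the request, while urllib accepts only a URL.
This means that with urllib alone you cannot, for example, spoof your User-Agent string.
urllib provides the urlencode method for building GET query strings and urllib2 does not, which is why urllib and urllib2 are often used together.
Most HTTP requests today are made through urllib2.

httplib
httplib implements the client side of the HTTP and HTTPS protocols. It is rarely used directly; the higher-level modules (urllib, urllib2) build on its HTTP implementation.

Basic urllib usage
urllib.urlopen(url[, data[, proxies]]) :

For detailed usage, see
urllib study notes

Basic urllib2 usage
The simplest form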
import urllib2
response=urllib2.urlopen('http://www.douban.com')
html=response.read()

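The actual steps:
1. urllib2.Request() builds the request; the req it returns is a fully constructed request.
2. urllib2.urlopen() sends the request req that was just built and returns a file-like object, response, which holds everything that came back.
3. response.read() returns the HTML in the response, and response.info() returns additional information such as the headers.
For example: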
#!/usr/bin/env python
import urllib2
req = urllib2.Request("http://www.douban.com")
response = urllib2.urlopen(req)
html = response.read()
print html
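The same three steps in Python 3 might be sketched as below. A data: URL stands in for a real site purely so that the example runs without network access; that substitution is an assumption of this sketch, not part of the original answer.

```python
import urllib.request

# Step 1: build the request (a data: URL replaces a real http:// address here).
req = urllib.request.Request("data:text/plain,hello%20urllib")
# Step 2: send it and get a file-like response object back.
response = urllib.request.urlopen(req)
# Step 3: read the body (bytes in Python 3) and look at the headers.
body = response.read()
content_type = response.info()["Content-type"]

print(body.decode("ascii"))
print(content_type)
```

Note that read() returns bytes in Python 3, so the body has to be decoded before it is handled as text.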

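Sometimes the program is correct but the server still refuses the request. Why? The problem lies in the request headers. Some servers are fussy and do not like being touched by programs. In that case you need to disguise your program as a browser when it sends the request, and that identity is carried in the headers.
A common case: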

import urllib
import urllib2
url = 'http://www.someserver.com/cgi-bin/register.cgi'
user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'  # put user_agent into the request headers
values = {'name' : 'who','password':'123456'}
headers = { 'User-Agent' : user_agent }
data = urllib.urlencode(values)
req = urllib2.Request(url, data, headers)
response = urllib2.urlopen(req)
the_page = response.read()

values holds the POST data
GET method
For example, a search site:
queries go through http://www..com/s?wd=XXX, so the dictionary {'wd': 'xxx'} has to be passed through urlencode

#coding:utf-8
import urllib
import urllib2
url = 'http://www..com/s'
values = {'wd':'D_in'}
data = urllib.urlencode(values)
print data
url2 = url+'?'+data
response = urllib2.urlopen(url2)
the_page = response.read()
print the_page

POST method

import urllib
import urllib2
url = 'http://www.someserver.com/cgi-bin/register.cgi'
user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'  # put user_agent into the request headers
values = {'name' : 'who','password':'123456'}  # POST data
headers = { 'User-Agent' : user_agent }
data = urllib.urlencode(values)  # URL-encode the POST data
req = urllib2.Request(url, data, headers)
response = urllib2.urlopen(req)
the_page = response.read()
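For comparison, a Python 3 version of the POST example might look like this. The main difference is that the request body must be bytes. The URL and form values are the ones used above, the User-Agent string is a placeholder, and the request is built but not sent.

```python
import urllib.parse
import urllib.request

url = "http://www.someserver.com/cgi-bin/register.cgi"  # URL taken from the example above
values = {"name": "who", "password": "123456"}

# In Python 3 the request body must be bytes, so the urlencoded string is encoded.
data = urllib.parse.urlencode(values).encode("ascii")
req = urllib.request.Request(
    url, data=data, headers={"User-Agent": "Mozilla/5.0 (compatible; demo)"})

print(req.get_method())  # a Request that carries data defaults to POST
print(req.data)
# urllib.request.urlopen(req) would submit the form; it is omitted here.
```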

Using urllib2 with cookies

#coding:utf-8
import urllib2,urllib
import cookielib

url = r'http://www.renren.com/ajaxLogin'

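# Create a cookie container named cj
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
# URL-encode the data to be POSTed; email and password are placeholders for real credentials
email, password = 'user@example.com', 'secret'
data = urllib.urlencode({"email": email, "password": password})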
r = opener.open(url,data)
print cj

Basic httplib usage
A simple example

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import httplib
import urllib

def sendhttp():
    data = urllib.urlencode({'@number': 12524, '@type': 'issue', '@action': 'show'})
    headers = {"Content-type": "application/x-www-form-urlencoded",
               "Accept": "text/plain"}
    conn = httplib.HTTPConnection('bugs.python.org')
    conn.request('POST', '/', data, headers)
    httpres = conn.getresponse()
    print httpres.status
    print httpres.reason
    print httpres.read()

if __name__ == '__main__':
    sendhttp()

For details, see
the httplib module
In Python 3.x, the urllib and urllib2 libraries were merged into the urllib package.
First, the module imports change from
import urllib
import urllib2
to
import urllib.request

Then the urllib2 functions are used as follows:
urllib2.urlopen() became urllib.request.urlopen()
urllib2.Request() became urllib.request.Request()

urllib2.URLError became urllib.error.URLError
And when you want to send a POST request with data, the call that in Python 2 was
urllib.urlencode(data)

becomes, in Python 3,
urllib.parse.urlencode(data)

Example script:
In Python 2:

import urllib
import urllib2
import json
from config import settings


def url_request(self, action, url, **extra_data):
    abs_url = "http://%s:%s/%s" % (settings.configs['Server'],
                                   settings.configs["ServerPort"],
                                   url)
    if action in ('get', 'GET'):
        print(abs_url, extra_data)
        try:
            req = urllib2.Request(abs_url)
            req_data = urllib2.urlopen(req, timeout=settings.configs['RequestTimeout'])
            callback = req_data.read()
            # print "-->server response:",callback
            return callback

        except urllib2.URLError as e:
            exit("\033[31;1m%s\033[0m" % e)
    elif action in ('post', 'POST'):
        # print(abs_url,extra_data['params'])
        try:
            data_encode = urllib.urlencode(extra_data['params'])
            req = urllib2.Request(url=abs_url, data=data_encode)
            res_data = urllib2.urlopen(req, timeout=settings.configs['RequestTimeout'])
            callback = res_data.read()
            callback = json.loads(callback)
            print("\033[31;1m[%s]:[%s]\033[0m response:\n%s" % (action, abs_url, callback))
            return callback
        except Exception as e:
            print('---exec', e)
            exit("\033[31;1m%s\033[0m" % e)

In Python 3.x:

import json
import urllib.error
import urllib.parse
import urllib.request
from config import settings


def url_request(self, action, url, **extra_data):
    abs_url = 'http://%s:%s/%s/' % (settings.configs['Server'], settings.configs['ServerPort'], url)
    if action in ('get', 'GET'):  # GET request
        print(action, extra_data)
        try:
            req = urllib.request.Request(abs_url)
            req_data = urllib.request.urlopen(req, timeout=settings.configs['RequestTimeout'])
            callback = req_data.read()
            return callback
        except urllib.error.URLError as e:
            exit("\033[31;1m%s\033[0m" % e)
    elif action in ('post', 'POST'):  # POST data to the server
        try:
            # urlopen() needs the request body as bytes in Python 3
            data_encode = urllib.parse.urlencode(extra_data['params']).encode('utf-8')
            req = urllib.request.Request(url=abs_url, data=data_encode)
            req_data = urllib.request.urlopen(req, timeout=settings.configs['RequestTimeout'])
            callback = req_data.read()
            callback = json.loads(callback.decode())
            return callback
        except urllib.error.URLError as e:
            print('---exec', e)
            exit("\033[31;1m%s\033[0m" % e)

The settings configuration is as follows:

configs = {
    'HostID': 2,
    "Server": "localhost",
    "ServerPort": 8000,
    "urls": {
        'get_configs': ['api/client/config', 'get'],  # acquire all the services that will be monitored
        'service_report': ['api/client/service/report/', 'post'],
    },
    'RequestTimeout': 30,
    'ConfigUpdateInterval': 300,  # 5 mins as default
}

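⑨ What can cause a Python urllib2 request to wait forever?

There are many possible causes, such as an endless chain of 302 redirects, so you will need to run more tests to narrow it down. Note also that urlopen() has no timeout by default, so passing a timeout argument keeps a stalled connection from blocking indefinitely.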

⑩ How do I disable 301/302 redirects with urllib.request in Python 3.5?

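# Written against Python 2's urllib2; in Python 3 the base class is
# urllib.request.HTTPRedirectHandler.
# Note: this handler still follows the redirect. It only records the original
# status code (301 or 302) on the final response.
class SmartRedirectHandler(urllib2.HTTPRedirectHandler):
    def http_error_301(self, req, fp, code, msg, headers):
        result = urllib2.HTTPRedirectHandler.http_error_301(
            self, req, fp, code, msg, headers)
        result.status = code
        return result

    def http_error_302(self, req, fp, code, msg, headers):
        result = urllib2.HTTPRedirectHandler.http_error_302(
            self, req, fp, code, msg, headers)
        result.status = code
        return result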
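To actually stop urllib.request from following a redirect in Python 3, one option is a handler that hands the 3xx response back to the caller instead of calling the parent implementation. The sketch below is one possible approach, not the only one: NoRedirectHandler and the local test server RedirectingHandler are invented names, and the server exists only so the example can run without an external site.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class RedirectingHandler(BaseHTTPRequestHandler):
    """Invented test server: every GET answers 302 to /target."""

    def do_GET(self):
        self.send_response(302)
        self.send_header("Location", "/target")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # silence per-request logging


class NoRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Return the 3xx response to the caller instead of following it."""

    def http_error_302(self, req, fp, code, msg, headers):
        return fp

    http_error_301 = http_error_303 = http_error_307 = http_error_302


server = HTTPServer(("127.0.0.1", 0), RedirectingHandler)  # port 0: any free port
threading.Thread(target=server.serve_forever, daemon=True).start()

opener = urllib.request.build_opener(
    urllib.request.ProxyHandler({}),  # ignore proxy environment variables
    NoRedirectHandler())
response = opener.open("http://127.0.0.1:%d/start" % server.server_port)
status = response.status
location = response.headers["Location"]
response.close()

server.shutdown()
server.server_close()
print(status, location)
```

With this handler the caller sees the 302 status and the Location header and can decide whether to request the new URL.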
