Some libraries I use outside of web scraping.

random: random numbers

Example:

import random

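# Random int in the range, both ends included (1 to 10)
print(random.randint(1, 10))

# Random float in the range
print(random.uniform(0.1, 2.2))

# Random float in [0.0, 1.0)
print(random.random())

# Random int from the range with step 2, end excluded (1, 3, 5, 7 or 9)
print(random.randrange(1, 10, 2))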
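The calls above use the module-level generator, so their output changes on every run. When repeatable values are needed (for tests, say), a seeded `random.Random` instance can be used instead. The following is a minimal sketch; the seed `42` and the `colors` list are illustrative values, not part of the original notes.

```python
import random

# A dedicated generator with a fixed seed; it does not touch the global
# state used by random.randint() and the other module-level functions.
rng = random.Random(42)
draws = [rng.randint(1, 10) for _ in range(5)]

# Re-creating the generator with the same seed replays the same values.
replay_rng = random.Random(42)
replay = [replay_rng.randint(1, 10) for _ in range(5)]
print(draws == replay)  # True

# Other common helpers on the same generator.
colors = ["red", "green", "blue", "yellow"]  # illustrative data
picked = rng.choice(colors)      # one element
pair = rng.sample(colors, k=2)   # two distinct elements, list left untouched
shuffled = colors[:]             # copy first, because shuffle() works in place
rng.shuffle(shuffled)
print(picked, pair, shuffled)
```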

pillow: image processing

Installation:

pip install pillow

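Cropping:

# The box is (left, upper, right, lower) in pixels, not (x, y, w, h).
# pic is a PIL.Image.Image; crop() returns a new image.
cropped = pic.crop((left, upper, right, lower))

Reference: https://www.liaoxuefeng.com/wiki/1016959663602400/1017785454949568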
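Since coordinates often arrive as x, y, width, height, a small conversion helper can avoid mixing up the two conventions. This is a sketch only: `xywh_to_box` and `crop_xywh` are hypothetical helper names, not part of Pillow, and the block itself does not import Pillow so it can run anywhere.

```python
def xywh_to_box(x, y, w, h):
    """Convert (x, y, width, height) to Pillow's (left, upper, right, lower)."""
    if w <= 0 or h <= 0:
        raise ValueError("width and height must be positive")
    return (x, y, x + w, y + h)


def crop_xywh(image, x, y, w, h):
    """Crop an image object that has a Pillow-style crop(box) method."""
    return image.crop(xywh_to_box(x, y, w, h))


print(xywh_to_box(10, 20, 100, 50))  # (10, 20, 110, 70)
```

With Pillow installed, `crop_xywh(pic, 10, 20, 100, 50)` should return a new 100 x 50 image, where `pic` is an image opened with `Image.open`.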

URL encoding

from urllib.parse import quote

quote(name, safe='')  # percent-encode name for use in a URL; safe='' also encodes '/'
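To make the effect of `safe=''` concrete, here is a short sketch comparing it with the default (`safe='/'`), plus a round trip through `unquote`. The sample strings are illustrative values.

```python
from urllib.parse import quote, unquote

name = "a b/c?d=1"  # illustrative value

print(quote(name))           # a%20b/c%3Fd%3D1   ('/' is kept by default)
print(quote(name, safe=""))  # a%20b%2Fc%3Fd%3D1 ('/' is encoded too)

# Non-ASCII text is encoded as UTF-8 bytes by default.
encoded = quote("中文", safe="")
print(encoded)               # %E4%B8%AD%E6%96%87
print(unquote(encoded))      # 中文
```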

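os.path: file and path handling

os.path.split(path)  # split into (directory, file name)
os.path.splitext(path)  # split into (everything before the extension, extension)
os.path.normcase(path)  # on Windows, lowercase and convert / to \; unchanged on POSIX
os.path.normpath(path)  # collapse redundant separators and . / .. components
os.path.join(path1[, path2[, ...]])  # join path components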

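import os
import time

# file is the path of an existing file
print(os.path.getatime(file))  # last access time (seconds since the epoch)
print(os.path.getctime(file))  # creation time on Windows; last metadata change time on Unix
print(os.path.getmtime(file))  # last modification time
print(time.gmtime(os.path.getmtime(file)))  # last modification time as a UTC struct_time
print(os.path.getsize(file))  # file size in bytes
print(os.path.abspath(file))  # absolute path
print(os.path.normpath(file))  # normalized path string

Reference: https://www.runoob.com/python/python-os-path.html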
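The path-manipulation functions are easier to remember with concrete input and output. The sketch below uses `posixpath` (the module that `os.path` resolves to on Linux and macOS) so the results are the same on every platform; the path itself is an illustrative value.

```python
import posixpath  # os.path is this module on Linux/macOS

path = "/home/user/../user/./docs//report.tar.gz"  # illustrative path

clean = posixpath.normpath(path)
print(clean)             # /home/user/docs/report.tar.gz

folder, filename = posixpath.split(clean)
print(folder, filename)  # /home/user/docs report.tar.gz

# splitext() only splits off the last extension.
stem, ext = posixpath.splitext(filename)
print(stem, ext)         # report.tar .gz

joined = posixpath.join(folder, "archive", stem + ".zip")
print(joined)            # /home/user/docs/archive/report.tar.zip
```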

Example: listing the entries of a directory

with os.scandir(path) as files:
    for file in files:
        ...
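One possible loop body, as a sketch: collect the name and size of every regular file directly inside a directory. `list_files` is a hypothetical helper name, and the temporary directory exists only to make the example runnable.

```python
import os
import tempfile


def list_files(path):
    """Return sorted (name, size_in_bytes) pairs for regular files in path."""
    result = []
    with os.scandir(path) as entries:
        for entry in entries:
            # Each entry is an os.DirEntry; is_file() skips subdirectories.
            if entry.is_file():
                result.append((entry.name, entry.stat().st_size))
    return sorted(result)


with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "a.txt"), "w") as f:
        f.write("hello")
    os.mkdir(os.path.join(tmp, "sub"))  # ignored by list_files()
    listing = list_files(tmp)
    print(listing)  # [('a.txt', 5)]
```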

datetime: date and time handling

Basics:

import datetime

now = datetime.datetime(2021, 3, 5, 7, 32, 10)

print('Year:', now.year)
print('Month:', now.month)
print('Day:', now.day)
print('Hour:', now.hour)
print('Minute:', now.minute)
print('Second:', now.second)
print('')

print('Date:', now.date())
print('Time:', now.time())
print('')

print('struct_time:', now.timetuple())
print('UTC struct_time:', now.utctimetuple())
print('Proleptic Gregorian ordinal:', now.toordinal())
print('ISO 8601 format:', now.isoformat())
print('Day of week (1 = Monday):', now.isoweekday())
print('Day of week (0 = Monday):', now.weekday())
print('')

print('Formatted time:', now.strftime('%Y/%m/%d %H:%M:%S'))
print(datetime.datetime.strptime('2021/03/05 7:32:00', '%Y/%m/%d %H:%M:%S'))
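As a quick check of how parsing and formatting relate, the sketch below round-trips the same string through `strptime` and `strftime`. Note that `%H` accepts the single-digit hour `7` on input but always writes two digits on output.

```python
import datetime

fmt = "%Y/%m/%d %H:%M:%S"
parsed = datetime.datetime.strptime("2021/03/05 7:32:00", fmt)

print(parsed.strftime(fmt))  # 2021/03/05 07:32:00
print(parsed.isoformat())    # 2021-03-05T07:32:00

# fromisoformat() (Python 3.7+) parses the isoformat() output back.
same = datetime.datetime.fromisoformat("2021-03-05T07:32:00") == parsed
print(same)                  # True

print(parsed.isoweekday(), parsed.weekday())  # 5 4 (a Friday)
```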

Addition and subtraction:

import datetime

today = datetime.date.today()
five_days_ago = today + datetime.timedelta(days=-5)

print('Today:', today)
print('Five days ago:', five_days_ago)

now_time = datetime.datetime.now()
after_5_hours_10_minutes = now_time + datetime.timedelta(hours=5, minutes=10)

print(now_time)
print(after_5_hours_10_minutes)
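Subtracting one `datetime` from another goes the other way and yields a `timedelta`. A short sketch with fixed, illustrative dates so the output is reproducible:

```python
import datetime

start = datetime.datetime(2021, 3, 1)
end = datetime.datetime(2021, 3, 5, 7, 32, 10)

delta = end - start
print(delta)                  # 4 days, 7:32:10
print(delta.days)             # 4
print(delta.seconds)          # 27130 (the part of the difference below one day)
print(delta.total_seconds())  # 372730.0
print(start + delta == end)   # True
```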

Reference: the official datetime documentation.

pymssql: connecting to SQL Server

Basic example:

from os import getenv
import pymssql

server = getenv("PYMSSQL_TEST_SERVER")
user = getenv("PYMSSQL_TEST_USERNAME")
password = getenv("PYMSSQL_TEST_PASSWORD")

conn = pymssql.connect(server, user, password, "tempdb")
cursor = conn.cursor()
cursor.execute("""
IF OBJECT_ID('persons', 'U') IS NOT NULL
    DROP TABLE persons
CREATE TABLE persons (
    id INT NOT NULL,
    name VARCHAR(100),
    salesrep VARCHAR(100),
    PRIMARY KEY(id)
)
""")
cursor.executemany(
    "INSERT INTO persons VALUES (%d, %s, %s)",
    [(1, 'John Smith', 'John Doe'),
     (2, 'Jane Doe', 'Joe Dog'),
     (3, 'Mike T.', 'Sarah H.')])
# you must call commit() to persist your data if you don't set autocommit to True
conn.commit()

cursor.execute('SELECT * FROM persons WHERE salesrep=%s', 'John Doe')
row = cursor.fetchone()
while row:
    print("ID=%d, Name=%s" % (row[0], row[1]))
    row = cursor.fetchone()

conn.close()

Reference: http://www.pymssql.org/pymssql_examples.html
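The insert / commit / select flow above can be tried without a SQL Server instance by swapping in the standard-library `sqlite3` module, which follows the same DB-API pattern. This is a stand-in sketch, not pymssql code: sqlite3 uses `?` placeholders where pymssql uses `%s` and `%d`, and the table is a simplified copy of the one above. It also shows iterating over the cursor instead of calling `fetchone()` in a loop.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute(
    "CREATE TABLE persons (id INTEGER PRIMARY KEY, name TEXT, salesrep TEXT)"
)
cursor.executemany(
    "INSERT INTO persons VALUES (?, ?, ?)",
    [(1, "John Smith", "John Doe"),
     (2, "Jane Doe", "Joe Dog"),
     (3, "Mike T.", "Sarah H.")],
)
conn.commit()  # persist the inserts

# Parameters are passed as a tuple; the cursor is iterable after execute().
cursor.execute("SELECT id, name FROM persons WHERE salesrep = ?", ("John Doe",))
rows = [row for row in cursor]
for row_id, name in rows:
    print("ID=%d, Name=%s" % (row_id, name))  # ID=1, Name=John Smith

conn.close()
```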