How to Download Financial Statements of Foreign Listed Companies (How to Look Up a Listed Company's Financial Statements)

Preface

I built a small tool in VBA that batch-downloads the financial statements of listed companies from a certain website.

The approach is very simple:

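1. Find the pattern in the site's download links: each one starts with a fixed prefix (@#￥%……&), followed by the type of statement to download (BS, ER, SCF), then the accounting period (annual, by reporting period, or quarterly), then the listed company's stock code.

2. Use an Excel sheet to generate every combination of those download links, then visit each one. I did not send GET/POST requests directly, because the site detects them and refuses access, so the downloaded files come back empty. Instead I used a cruder method: have IE open each URL, then use VBA's SendKeys method to simulate pressing the download button by hand.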
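The two steps above can be sketched in Python as follows. The base URL and the query fields (`export`, `type`, `code`) are copied from the script later in this post; the names `build_urls`, `STATEMENTS` and `PERIODS` are made up for this illustration. The sketch builds the full cross product, which includes a `debt`/`simple` pair that the script's own lookup table does not list, so treat that combination as an untested assumption.

```python
from itertools import product

# Base URL as it appears in the script in this post.
BASE_URL = "http://basic.10jqka.com.cn/api/stock/export.php"

# Statement and period keys taken from the script's lookup table.
STATEMENTS = ("main", "debt", "benefit", "cash")
PERIODS = ("report", "year", "simple")


def build_urls(codes):
    """Return {file name: download URL} for every statement/period/code combination."""
    urls = {}
    for statement, period, code in product(STATEMENTS, PERIODS, codes):
        code = str(code).zfill(6)  # pad short codes, e.g. "2" -> "000002"
        file_name = f"{code}_{statement}_{period}.xls"
        urls[file_name] = (
            f"{BASE_URL}?export={statement}&type={period}&code={code}"
        )
    return urls
```

Calling `build_urls(["2"])` yields twelve entries keyed by file name, which is the same shape as the `urls` dictionary the full script builds from the Excel sheet.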

After running it I was not blocked, so the approach works.

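It feels like using a hand grenade to bring down a helicopter, and the VBA version has a major weakness: SendKeys sends keystrokes to whichever window has focus, so if IE is not in the foreground of the desktop (for example, WeChat suddenly pops up a message), Application.SendKeys fails to reach the download button.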

Here I build the tool again in Python, although I still cannot avoid borrowing a few features from VBA.

The idea is the same as above: assemble the download links according to the site's rules, then have IE download them.

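The difference is that this time I do not simulate keyboard shortcuts. Instead I use pyautogui's image recognition to find the download button on screen and click it with the mouse. This method is less easily interrupted by pop-ups and hits the button more reliably.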
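The locate-then-click idea can be separated from pyautogui itself, which makes the retry logic easy to test without a screen. The helper below is a minimal sketch, not the author's code: `click_when_found` and its parameters are hypothetical names, and catching every exception on a miss is a simplifying assumption (some pyautogui versions raise instead of returning `None` when the image is not found).

```python
import time


def click_when_found(locate, click, attempts=5, delay=0.5):
    """Call locate() up to `attempts` times and click the first position found.

    Returns True if a click was made, False if nothing was found.
    """
    for attempt in range(attempts):
        try:
            position = locate()
        except Exception:
            # Treat any failure to locate the image as a miss (assumption).
            position = None
        if position is not None:
            click(position)
            return True
        if attempt < attempts - 1:
            time.sleep(delay)  # give the download bar time to appear
    return False


# Intended use with pyautogui (not executed here):
#   click_when_found(
#       locate=lambda: pyautogui.locateCenterOnScreen("capture.png", confidence=0.9),
#       click=lambda pos: pyautogui.click(pos.x, pos.y),
#   )
```

Capping the number of attempts mirrors the five-try limit in the full script, so a missing button cannot cause an endless loop.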

Python code:

"""
Prerequisites
1. Keep "从THS批量下载上市公司财报.py", "从THS批量下载上市公司财报.xlsm" and "capture.png" in the same directory.
2. Install these third-party libraries: pyautogui, pywin32, pandas, xlwings
   (the confidence argument used below also needs opencv-python).
3. Set IE's default download folder to the directory that holds this file.
4. Minimize the window running this script right after it starts, so that it
   does not cover the screen and stop pyautogui from locating the image.

Missing libraries can be installed from the command prompt with "pip install <name>":
pip install pyautogui
pip install pywin32
pip install pandas
pip install xlwings
"""
import os
import time

import pandas
import pyautogui
import xlwings
from win32com.client import DispatchEx

WORKBOOK = r'./从THS批量下载上市公司财报.xlsm'

# THS (10jqka) downloads: {fixed part of the link: fixed part of the file name}
ref = {
    'main&type=report': 'main_report.xls',
    'main&type=year': 'main_year.xls',
    'main&type=simple': 'main_simple.xls',
    'debt&type=report': 'debt_report.xls',
    'debt&type=year': 'debt_year.xls',
    'benefit&type=report': 'benefit_report.xls',
    'benefit&type=year': 'benefit_year.xls',
    'benefit&type=simple': 'benefit_simple.xls',
    'cash&type=report': 'cash_report.xls',
    'cash&type=year': 'cash_year.xls',
    'cash&type=simple': 'cash_simple.xls',
}

df = pandas.read_excel(WORKBOOK, sheet_name='Main', dtype=str, header=0)

# Pad company codes shorter than 6 digits with leading zeros, e.g. 2 -> 000002
df['公司代码'] = df['公司代码'].apply(lambda x: str(x).zfill(6))

# Build a {download file name: download link} dictionary
urls = {}
for i, suffix in ref.items():
    for j in df['公司代码']:
        # Skip files downloaded on an earlier run to save time
        if not os.path.exists(f'{j}_{suffix}'):
            urls[f'{j}_{suffix}'] = (
                f'http://basic.10jqka.com.cn/api/stock/export.php?export={i}&code={j}'
            )

# Keep a handle on the Excel instance so the same one can be closed at the end
app = xlwings.App(visible=False, add_book=False)
wb = app.books.open(WORKBOOK)

# QuitIE is a VBA macro stored in the workbook; it closes IE more reliably
# than win32com does. (I also never worked out how to quit IE completely
# from Python.)
QuitIE = wb.macro('QuitIE')

# XMLHTTP is another VBA macro in the workbook; it downloads more efficiently
XMLHTTP = wb.macro('XMLHTTP')

# Screenshot of the download button in IE
img = r'./capture.png'


def IEDownload(url):
    ie = DispatchEx('InternetExplorer.Application')
    ie.Navigate(url)
    # Look for the button at most 5 times to avoid an endless loop
    times = 0
    while times < 5:
        try:
            location = pyautogui.locateCenterOnScreen(img, confidence=0.9)
        except pyautogui.ImageNotFoundException:
            # Newer pyautogui releases raise instead of returning None
            location = None
        if location is not None:
            pyautogui.click(location.x, location.y, clicks=1, button='left',
                            duration=0.01, interval=0.01)
            break
        times += 1


windows = 0
for filename in urls:
    # Whenever the IE counter is a multiple of 7, try XMLHTTP first and fall
    # back to IE only if no file arrives. Lower the 7 if you feel bold.
    if windows % 7 == 0:
        XMLHTTP(filename, urls[filename])
        if not os.path.exists(f'./{filename}'):
            IEDownload(urls[filename])
            windows += 1
    else:
        IEDownload(urls[filename])
        windows += 1
    # Close all IE windows after every 7 IE downloads to free memory
    if windows % 7 == 0:
        time.sleep(0.05)
        QuitIE()

time.sleep(0.05)
QuitIE()
wb.close()
app.quit()

# Optional: convert the .xls files to the newer .xlsx format
# if not os.path.exists('./xlsx格式文件'):
#     os.mkdir('./xlsx格式文件')
# for i in os.listdir('.'):
#     if not os.path.exists(f'./xlsx格式文件/{i}x') and i.endswith('xls'):
#         df = pandas.read_excel(f'./{i}', header=1, index_col=0)
#         df.to_excel(f'./xlsx格式文件/{i}x')
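The main loop decides whether to fall back to IE by checking only that the file exists. Since blocked requests were described earlier as producing empty files, a stricter check that also looks at the file size may be worth considering. The function below is a small sketch of that idea; `download_succeeded` and `min_bytes` are hypothetical names, and whether the XMLHTTP macro ever writes an empty file is an assumption, not something the post confirms.

```python
import os


def download_succeeded(path, min_bytes=1):
    """Return True if `path` is a file holding at least `min_bytes` bytes."""
    return os.path.isfile(path) and os.path.getsize(path) >= min_bytes
```

In the loop, `if not download_succeeded(f'./{filename}')` could then replace the plain `os.path.exists` test before falling back to the IE download.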
