Resource search results
DouBan
- Crawls user information from Douban, such as user IDs, the books each user has read, and their book reviews; useful as a tool for collecting experimental data.
ImWeibo_7_9_4
- Scrapes Sina Weibo data; tested and confirmed to crawl Weibo account data correctly.
baikeSpider
- Python crawler for Baidu Baike (Baidu Encyclopedia) data; runs as-is and includes already-crawled data in JSON format.
data_crawler
- Targeted crawling of data from a specified website, with multiple threads running concurrently; illustrates the general approach to writing a crawler.
login
- Douban data crawling code for pages that require login, such as a user's friend data.
Test
- Douban data crawling using BeautifulSoup.
Frequency
- A packaged exe program that computes word-frequency statistics for a specified txt file; a convenience for data-crawling work.
GetStockRealData
- Crawls stock trading information; can be scheduled to update automatically every day and store the results in a database. Suitable for preparing data for stock-related research.
example-spider
- Batch-crawls useful data from web page source, such as image URLs and text; very convenient.
MyWorkForIFeng
- Crawler that collects news text data. The code is easy to adapt, which makes it helpful for beginners.
Big Data Crawler
- Crawls web page information, collecting the information you need, with data acquisition and analysis features.
Zhihu Crawler Tasks and Examples
- Zhihu data crawler written in Python, covering questions, featured answers, and highlighted comments, plus code for logging in to the Zhihu website automatically; of some reference value.
webmagic
- Web crawling implemented with webmagic, in Java.
anhuishengkongqizhiliang
- Crawls the air quality data published by the Anhui Provincial Department of Environmental Protection to obtain air quality data for Anhui Province.
Hefei Air Quality Data Crawling-test-2keyong
- Open the Anhui Provincial Department of Environmental Protection website, click the Hefei air quality data, copy the URL into the code, and run it to get the Hefei air quality data.
sohugupiao
- Stock data crawling function based on the Sohu Finance API. The result has ten columns: date, opening price, closing price, price change, percentage change, lowest price, highest price, volume, turnover, and turnover rate.
creeper
- Python-based web crawler for data crawling.
Crawling Popular Weibo Comments with Data Analysis and NLP Sentiment Analysis
- Crawls popular Weibo comments and performs data analysis and NLP sentiment analysis. xuenlp.py reads the database and de-duplicates the data, runs sentiment analysis on Weibo comments and generates summary statistics, ranks the emoji used in Weibo comments, and compiles the top-20 fan ranking from the comments.
Guangzhou Road Data (November 2018)
- Guangzhou road data crawled from Gaode Maps; can be opened in GIS software.
Official Intellectual Property Weibo Data
- Crawls data from official intellectual-property Weibo accounts, using the API provided by Weibo together with simulated user login.
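
The sohugupiao entry above lists a ten-column result (date, opening price, closing price, price change, percentage change, lowest price, highest price, volume, turnover, turnover rate). As a rough sketch of how one such row might be given named fields in Python: the `DailyQuote` type, the `parse_row` helper, and the assumption that each row arrives as ten strings in that order (with percentages possibly carrying a trailing `%`) are all hypothetical and are not taken from the listed package or from the Sohu API.

```python
from typing import NamedTuple, Sequence


class DailyQuote(NamedTuple):
    """One row of the ten-column layout (hypothetical field names)."""
    date: str
    open_price: float
    close_price: float
    change_amount: float
    change_percent: float
    low: float
    high: float
    volume: float
    turnover: float
    turnover_rate: float


def parse_row(row: Sequence[str]) -> DailyQuote:
    """Convert ten strings, in the order listed above, into a DailyQuote.

    Assumption: numeric columns are plain decimals, optionally ending in '%'.
    """
    if len(row) != 10:
        raise ValueError(f"expected 10 columns, got {len(row)}")

    def to_number(text: str) -> float:
        # Drop surrounding whitespace and a trailing percent sign, if any.
        return float(text.strip().rstrip("%"))

    return DailyQuote(row[0], *(to_number(value) for value in row[1:]))


# Made-up sample row, for illustration only.
sample = ["2018-11-01", "10.00", "10.50", "0.50", "5.00%",
          "9.90", "10.60", "12345", "129600.5", "1.20%"]
quote = parse_row(sample)
print(quote.date, quote.close_price, quote.turnover_rate)
```

Named fields keep later analysis code readable (`quote.close_price` rather than `row[2]`), and the length check surfaces a changed column layout early.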