Search resource list
NiceWords
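- NiceWords is an intelligent website-building system written in PHP by several senior engineers working at top portal sites. It uses crawler (spider) technology, word segmentation, and web page extraction, together with URL rewriting and caching, to automatically fetch information from the internet that matches the configured keywords and to keep itself updated. You only need to set a few keywords on the configuration page, and NiceWords generates a self-updating website fully automatically; everything else is left to NiceWords.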
shell.tar
- Spider program: a shell script that reads the contents of a file line by line and crawls web pages starting from the seed nodes, to a depth of 4 levels.
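The approach described for shell.tar (seeds read line by line, then a crawl limited to 4 levels) can be sketched in Python as a breadth-first traversal. This is only an illustration, not the shell script itself: the names `read_seeds`, `crawl`, and `get_links` are hypothetical, link extraction is passed in as a function so that no network access is needed, and it is an assumption that seeds count as depth 0 and links are followed up to 4 hops from them.

```python
from __future__ import annotations

from collections import deque
from typing import Callable, Iterable

MAX_DEPTH = 4  # depth named in the description; seeds are assumed to be depth 0


def read_seeds(lines: Iterable[str]) -> list[str]:
    """Read seed URLs one line at a time, skipping blank lines."""
    seeds = []
    for line in lines:
        url = line.strip()
        if url:
            seeds.append(url)
    return seeds


def crawl(
    seeds: Iterable[str],
    get_links: Callable[[str], list[str]],
    max_depth: int = MAX_DEPTH,
) -> dict[str, int]:
    """Breadth-first crawl; maps each visited URL to the depth where it was found."""
    depth_of: dict[str, int] = {}
    queue: deque[str] = deque()
    for seed in seeds:
        if seed not in depth_of:
            depth_of[seed] = 0
            queue.append(seed)
    while queue:
        url = queue.popleft()
        if depth_of[url] >= max_depth:
            continue  # pages at the depth limit are recorded but not expanded
        for link in get_links(url):
            if link not in depth_of:
                depth_of[link] = depth_of[url] + 1
                queue.append(link)
    return depth_of


# Hypothetical in-memory link graph standing in for real pages.
demo_graph = {"a": ["b"], "b": ["c"], "c": ["d"], "d": ["e"], "e": ["f"]}
demo_result = crawl(read_seeds(["a\n", "\n"]), lambda u: demo_graph.get(u, []))
```

In the demo graph, page "f" is five hops from the seed, so it is never visited. A real crawler would replace `get_links` with a function that downloads a page and extracts its links.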
smspro
- The user enters a keyword on a mobile phone and sends it as a text message to the SMS Server. This program fetches the message file the user sent to the server, matches it against the data txt file, and sends a text message back to the user's phone through the SMS Server API.
memcpy_mck
- Align dest to the nearest 8-byte boundary. We know we have at least 7 bytes to copy, enough to crawl to the 8-byte boundary. The actual number of bytes to crawl depends on the dest alignment. Copies of 7 bytes or less are taken care of at .memcpy_short.
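The arithmetic behind "the number of bytes to crawl depends on the dest alignment" can be shown with a small Python sketch. It is not the assembly routine: the function names are hypothetical, and splitting the remainder into 8-byte words plus a tail is an assumption about what an aligned copy typically does next.

```python
ALIGNMENT = 8  # boundary named in the description


def bytes_to_boundary(dest_addr: int, alignment: int = ALIGNMENT) -> int:
    """Bytes to copy one at a time until dest_addr reaches an aligned address."""
    return (-dest_addr) % alignment


def split_copy(dest_addr: int, length: int) -> tuple:
    """Split a copy into (head bytes, aligned 8-byte words, tail bytes)."""
    if length < ALIGNMENT - 1:
        # Shorter copies are assumed to go through a separate short-copy path.
        raise ValueError("copy too short for the aligned path")
    head = bytes_to_boundary(dest_addr)  # at most 7, so never more than length
    remaining = length - head
    return head, remaining // ALIGNMENT, remaining % ALIGNMENT
```

For example, a destination address ending in 5 needs 3 single-byte copies before it is 8-byte aligned, and an already aligned address needs none.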
scribe
- scribe: a tool Facebook uses to collect logs. The documentation covers distributed installation of scribe and successful examples of integrating it with tomcat7 and tomcat8.
xici_proxy
- Crawls the first 10 pages of Xici (change the total_page parameter to control how many pages are crawled) for high-anonymity proxy IPs with a validity period of more than 1 day, tests that they work, and saves the result as the file Proxies.json (Unicode). To use it, import the file and pick one proxy IP at random.
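The usage step described for xici_proxy (import the file, then pick a proxy at random) might look like the following Python sketch. The layout of Proxies.json is not given in the description, so a flat JSON list of "ip:port" strings is assumed here, and the function names are hypothetical.

```python
import json
import random


def parse_proxies(text: str) -> list:
    """Parse JSON text that is assumed to hold a list of "ip:port" strings."""
    proxies = json.loads(text)
    if not isinstance(proxies, list) or not proxies:
        raise ValueError("expected a non-empty JSON list of proxies")
    return proxies


def load_proxies(path: str = "Proxies.json") -> list:
    """Read the saved proxy file from disk and parse it."""
    with open(path, encoding="utf-8") as handle:
        return parse_proxies(handle.read())


def pick_proxy(proxies: list) -> str:
    """Pick one proxy at random from the loaded list."""
    return random.choice(proxies)


# Hypothetical sample content using documentation-range IP addresses.
sample = parse_proxies('["192.0.2.10:8080", "192.0.2.11:3128"]')
chosen = pick_proxy(sample)
```

If the real file stores a different structure (for example objects with separate ip and port fields), only `parse_proxies` would need to change.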
