Search resource list
pc4.5.tar
- A C implementation of the C4.5 decision tree, one of the classification algorithms used in data mining. It classifies a given data set and extracts rules from it.
Clustering.zip
- An implementation of a data mining algorithm: the maximal tree algorithm based on fuzzy clustering. The data set is darpa99, that is, the data set used in KDD-CUP99.
Sort
- Classifies a given data set. A KNN classification example of a data mining algorithm.
ENCLUS
- Entropy Based Subspace Clustering for Mining Data - ENCLUS - a new version of the PROCLUS algorithm for clustering high-dimensional data sets.
Data_gen-by-IBM
- IBM random data generator, an auxiliary tool for data mining that randomly generates a data set of a specified size from the input parameters.
Nnetmod
- A MATLAB implementation of the backpropagation classification algorithm for data mining. No test or training data sets are included.
DMProject
- Implementations of the KNN and K-means data mining algorithms, with an analysis of the results of mining the Iris data set.
Intrusion-Detection
- The problem of intrusion detection has been studied and received a lot of attention in machine learning and data mining in the literature survey. The existing techniques are not effective to improve the classification accuracy and to reduce high
MSapriori
- An association rule mining algorithm with multiple minimum supports. The data set is T10I4D100K, and the multiple minimum support threshold file is MS-change.
MSML
- Multi-level association rule mining with multiple minimum supports. The data set is T10I4D100K, and the multiple minimum support threshold is MSchange.
good-fpgrowth
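- An FP-TREE algorithm that I have debugged and got working, written as a C++ program under VS2010. The algorithm outputs confidence and support, and the supplied source includes the test data fpdata.dat. It took a long time to get working. Note carefully: within each row of the data set, no number may appear more than once. This is also a basic requirement of the FP-TREE data mining algorithm. The data set is a .dat file by default, but a txt file also works.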
Wine-Quality-Data-Set
- Red and white wine quality data sets, usable as a data mining database for machine learning.
Forest-Fires-Data-Set
- Forest fires data set, usable as a database for data mining.
dbscan
- A data mining assignment: DBSCAN-based classification on the iris data set. Useful for study.
muskSuccess
- The original MUSK1 data set, usable for multi-instance learning in data mining and machine learning. It has already been converted to ARFF format and can be used directly in weka.
MUSK2success
- The original MUSK2 data set, usable for multi-instance learning in data mining and machine learning. It has already been converted to ARFF format and can be used directly in weka.
recsys-challenge2015
- This code implements an algorithm for analyzing the recsys challenge2015 data set. It is very helpful for learning data mining and recommender systems.
data--preprocessing-using-kdd-data-set
- The Data Mining process model selected is KDD, which starts with the selection of data. Initially the researcher took the Kddcup.data-10-perecnt, which contains a total of 311,027 records, including both labeled and unlabeled records.
Geolife Data 1.3
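- Geolife GPS trajectory data set, user guide. This GPS trajectory data set was collected in the (Microsoft Research Asia) Geolife project by 178 users over a period of four years (April 2007 to October 2011). A GPS trajectory in this data set is represented by a sequence of time-stamped points, each containing latitude, longitude and altitude information. The data set contains 17,621 trajectories with a total distance of 1,251,654 km and a total duration of 48,203 hours. It can be used in many research fields, such as mobility pattern mining, user activity recognition, location-based social networks, location privacy and location recommendation.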
sklearn-tree-BN-knn
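- Performance comparison and tuning of classifiers: the tree, Bayesian and KNN classifiers in the scikit-learn package are used to train models on the data, with the aim of understanding their principles and use. The performance of the three classifiers in the experiments is compared and their characteristics are analyzed. The data sets used in this experiment are house and segment.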
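
The good-fpgrowth entry above notes that no row of the input data may contain a repeated number. As a rough illustration of that requirement only, the Python sketch below removes repeats within each row before counting item support. It is not the listed C++ program: the whitespace-separated row format and the helper names `clean_transactions` and `item_support` are assumptions made for this example.

```python
from collections import Counter


def clean_transactions(lines):
    """Parse rows of whitespace-separated item IDs (assumed format),
    dropping repeated items within each row."""
    transactions = []
    for line in lines:
        items = line.split()
        if not items:
            continue  # skip blank rows
        # dict.fromkeys keeps the first occurrence of each item, in order
        transactions.append(list(dict.fromkeys(items)))
    return transactions


def item_support(transactions):
    """Count, for each item, how many transactions contain it."""
    counts = Counter()
    for transaction in transactions:
        counts.update(transaction)
    return counts


# Hypothetical sample rows; two of them contain a repeated item.
rows = ["1 2 2 3", "2 3 3", "", "1 3"]
cleaned = clean_transactions(rows)
support = item_support(cleaned)
print(cleaned)
print(dict(support))
```

Without the cleaning step, a repeated item would be counted twice for a single row, which would inflate its support count.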