
Master these and you're set: commonly used machine learning & data mining knowledge points

 haosunzhe 2015-01-04

Basics:


MSE (Mean Square Error), LMS (Least Mean Squares), LSM (Least Squares Method), MLE (Maximum Likelihood Estimation), QP (Quadratic Programming), CP (Conditional Probability), JP (Joint Probability), MP (Marginal Probability), Bayes' Formula, L1/L2 Regularization (and newer variants that are currently popular, such as L2.5 regularization), GD (Gradient Descent), SGD (Stochastic Gradient Descent), Eigenvalues, Eigenvectors, QR Decomposition, Quantiles, Covariance Matrix.
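Several of these basics fit together directly: fitting a linear model by minimizing MSE with gradient descent, for instance. A minimal pure-Python sketch (the toy data, learning rate, and epoch count are made up for illustration):

```python
# Sketch: fit y = w*x + b by batch gradient descent on the MSE loss.

def fit_gd(xs, ys, lr=0.05, epochs=500):
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of MSE = (1/n) * sum((w*x + b - y)^2)
        gw = (2.0 / n) * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        gb = (2.0 / n) * sum((w * x + b - y) for x, y in zip(xs, ys))
        w -= lr * gw
        b -= lr * gb
    return w, b

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]   # exactly y = 2x + 1
w, b = fit_gd(xs, ys)            # converges near w = 2, b = 1
```

Swapping the full-batch sums for a single randomly chosen point per step turns this into SGD.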


Common Distributions:


Discrete distributions: Bernoulli Distribution / Binomial Distribution, Negative Binomial Distribution, Multinomial Distribution, Geometric Distribution, Hypergeometric Distribution, Poisson Distribution.


Continuous distributions: Uniform Distribution, Normal Distribution / Gaussian Distribution, Exponential Distribution, Lognormal Distribution, Gamma Distribution, Beta Distribution, Dirichlet Distribution, Rayleigh Distribution, Cauchy Distribution, Weibull Distribution.


The three sampling distributions: Chi-square Distribution, t-Distribution, F-Distribution.
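As a quick sanity check on one of the discrete distributions above, the binomial pmf can be computed straight from its formula; the probabilities must sum to 1 and the mean must come out to n·p. A small illustrative sketch (n and p are arbitrary):

```python
import math

# Binomial pmf: P(X = k) = C(n, k) * p^k * (1 - p)^(n - k)
def binom_pmf(k, n, p):
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 10, 0.3
total = sum(binom_pmf(k, n, p) for k in range(n + 1))      # should be 1
mean = sum(k * binom_pmf(k, n, p) for k in range(n + 1))   # should be n*p = 3
```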


Data Pre-processing:


Missing Value Imputation, Discretization, Mapping, Normalization/Standardization.


Sampling:


Simple Random Sampling, Offline Sampling (offline equal-probability sampling of K items), Online Sampling (online equal-probability sampling of K items), Ratio-based Sampling, Acceptance-Rejection Sampling, Importance Sampling, MCMC (Markov Chain Monte Carlo sampling: Metropolis-Hastings & Gibbs).
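"Online equal-probability sampling of K items" is commonly implemented as reservoir sampling: each stream element ends up in the sample with probability K/n without knowing n in advance. A minimal sketch (the stream and K are arbitrary):

```python
import random

def reservoir_sample(stream, k, rng=random):
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)          # fill the reservoir first
        else:
            j = rng.randint(0, i)           # uniform index in [0, i]
            if j < k:
                reservoir[j] = item         # replace with probability k/(i+1)
    return reservoir

sample = reservoir_sample(range(1000), 10)  # 10 items, each kept w.p. 10/1000
```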


Clustering:


K-Means, K-Medoids, Bisecting K-Means, FK-Means, Canopy, Spectral Clustering (spectral K-Means), GMM-EM (Gaussian Mixture Model fitted with Expectation-Maximization), K-Prototypes, CLARANS (partition-based), BIRCH (hierarchical), CURE (hierarchical), DBSCAN (density-based), CLIQUE (density- and grid-based).
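A bare-bones sketch of K-Means (Lloyd's algorithm) on toy 2-D points; the data and initial centroids are hand-picked here so the run is deterministic:

```python
def kmeans(points, centroids, iters=10):
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            d = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[d.index(min(d))].append(p)
        # Update step: move each centroid to its cluster mean
        centroids = [
            tuple(sum(coord) / len(cl) for coord in zip(*cl)) if cl else c
            for cl, c in zip(clusters, centroids)
        ]
    return centroids

pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
cents = kmeans(pts, [(0, 0), (10, 10)])   # settles at the two cluster means
```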


Classification & Regression:


LR (Linear Regression), LR (Logistic Regression), SR (Softmax Regression, multi-class logistic regression), GLM (Generalized Linear Model), RR (Ridge Regression, L2-regularized least squares), LASSO (Least Absolute Shrinkage and Selection Operator, L1-regularized least squares), RF (Random Forest), DT (Decision Tree), GBDT (Gradient Boosting Decision Tree), CART (Classification And Regression Tree), KNN (K-Nearest Neighbors), SVM (Support Vector Machine), Kernel Functions (Polynomial Kernel, Gaussian Kernel / RBF Radial Basis Function, String Kernel), NB (Naive Bayes), BN (Bayesian Network / Bayesian Belief Network / Belief Network), LDA (Linear Discriminant Analysis / Fisher Linear Discriminant), EL (Ensemble Learning: Boosting, Bagging, Stacking), AdaBoost (Adaptive Boosting), MEM (Maximum Entropy Model).
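As one concrete example from this list, logistic regression is just gradient descent on the cross-entropy loss with a sigmoid output. A tiny sketch on a made-up, linearly separable 1-D dataset:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logreg(xs, ys, lr=0.5, epochs=2000):
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Cross-entropy gradient: (sigmoid(wx + b) - y) times the input
        gw = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        gb = sum((sigmoid(w * x + b) - y) for x, y in zip(xs, ys)) / n
        w -= lr * gw
        b -= lr * gb
    return w, b

xs = [-2.0, -1.5, -1.0, 1.0, 1.5, 2.0]
ys = [0, 0, 0, 1, 1, 1]
w, b = fit_logreg(xs, ys)
# thresholding sigmoid(w*x + b) at 0.5 should reproduce the labels
```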


Effectiveness Evaluation (classification metrics):


Confusion Matrix, Precision, Recall, Accuracy, F-score, ROC Curve, AUC (Area Under the Curve), Lift Curve, KS Curve.
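Precision, recall, and F-score fall straight out of the confusion-matrix counts; a small sketch with toy labels chosen for illustration:

```python
def prf1(y_true, y_pred):
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp)   # of predicted positives, how many are real
    recall = tp / (tp + fn)      # of real positives, how many were found
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]   # one miss, one false alarm
p, r, f = prf1(y_true, y_pred)      # 0.75, 0.75, 0.75 here
```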


PGM (Probabilistic Graphical Models):


BN (Bayesian Network / Bayesian Belief Network / Belief Network), MC (Markov Chain), HMM (Hidden Markov Model), MEMM (Maximum Entropy Markov Model), CRF (Conditional Random Field), MRF (Markov Random Field).


NN (Neural Networks):


ANN (Artificial Neural Network), BP (Error Back-Propagation).


Deep Learning:


Auto-encoders, SAE (Stacked Auto-encoders: Sparse Auto-encoders, Denoising Auto-encoders, Contractive Auto-encoders), RBM (Restricted Boltzmann Machine), DBN (Deep Belief Network), CNN (Convolutional Neural Network), Word2Vec (word embedding model).


Dimensionality Reduction:


LDA (Linear Discriminant Analysis / Fisher Linear Discriminant), PCA (Principal Component Analysis), ICA (Independent Component Analysis), SVD (Singular Value Decomposition), FA (Factor Analysis).
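PCA and SVD are tied together: the principal directions of a dataset are the right singular vectors of the centered data matrix. A minimal NumPy sketch on made-up points lying almost on the line y = x:

```python
import numpy as np

X = np.array([[1.0, 1.1], [2.0, 1.9], [3.0, 3.2], [4.0, 3.8]])
Xc = X - X.mean(axis=0)                       # center each column
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
pc1 = Vt[0]                                   # first principal component
projected = Xc @ pc1                          # 1-D coordinates along pc1
# pc1 should be close to (1, 1)/sqrt(2), since the data follows y = x
```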


Text Mining:


VSM (Vector Space Model), Word2Vec (word embedding model), TF (Term Frequency), TF-IDF (Term Frequency-Inverse Document Frequency), MI (Mutual Information), ECE (Expected Cross Entropy), QEMI (quadratic information entropy), IG (Information Gain), IGR (Information Gain Ratio), Gini (Gini coefficient), χ² Statistic, TEW (Text Evidence Weight), OR (Odds Ratio), N-Gram Models, LSA (Latent Semantic Analysis), PLSA (Probabilistic Latent Semantic Analysis), LDA (Latent Dirichlet Allocation).
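TF-IDF is short enough to write out from its definition: term frequency within a document, down-weighted by how many documents contain the term. A sketch over a three-document toy corpus:

```python
import math

docs = [["the", "cat", "sat"], ["the", "dog", "sat"], ["the", "cat", "ran"]]
N = len(docs)

def tf_idf(term, doc):
    tf = doc.count(term) / len(doc)        # term frequency in this doc
    df = sum(term in d for d in docs)      # document frequency in the corpus
    return tf * math.log(N / df)           # idf = log(N / df)

# "the" appears in every document, so its idf (and tf-idf) is zero,
# while "cat" carries weight in the documents that contain it.
```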


Association Mining:


Apriori, FP-growth (Frequent Pattern Tree Growth), AprioriAll, SPADE.
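The core Apriori idea is that a candidate itemset can only be frequent if all its subsets are; a toy sketch of the first two passes (transactions and the support threshold are made up):

```python
from itertools import combinations

transactions = [{"milk", "bread"}, {"milk", "eggs"},
                {"milk", "bread", "eggs"}, {"bread", "eggs"}]
min_support = 2  # absolute transaction count

# Pass 1: frequent single items
items = {i for t in transactions for i in t}
freq1 = {i for i in items
         if sum(i in t for t in transactions) >= min_support}

# Pass 2: candidate pairs are built only from frequent items
freq2 = {
    pair for pair in combinations(sorted(freq1), 2)
    if sum(set(pair) <= t for t in transactions) >= min_support
}
```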


Recommendation Engines:


DBR (Demographic-based Recommendation), CBR (Content-based Recommendation), CF (Collaborative Filtering), UCF (User-based Collaborative Filtering), ICF (Item-based Collaborative Filtering).
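User-based collaborative filtering in miniature: score an unseen item for a user as a similarity-weighted average of other users' ratings. The users, items, and ratings below are invented for illustration:

```python
import math

ratings = {                      # user -> {item: rating}
    "alice": {"a": 5, "b": 3, "c": 4},
    "bob":   {"a": 5, "b": 3, "c": 4, "d": 5},
    "carol": {"a": 1, "b": 5, "c": 2, "d": 1},
}

def cosine(u, v):
    common = set(u) & set(v)     # compare only co-rated items
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    nu = math.sqrt(sum(u[i] ** 2 for i in common))
    nv = math.sqrt(sum(v[i] ** 2 for i in common))
    return dot / (nu * nv)

def predict(user, item):
    num = den = 0.0
    for other, r in ratings.items():
        if other != user and item in r:
            s = cosine(ratings[user], r)
            num += s * r[item]
            den += s
    return num / den

pred = predict("alice", "d")     # pulled toward bob's 5, since alice == bob
```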


Similarity & Distance Measures:


Euclidean Distance, Manhattan Distance, Chebyshev Distance, Minkowski Distance, Standardized Euclidean Distance, Mahalanobis Distance, Cosine Similarity, Hamming Distance / Edit Distance, Jaccard Distance, Correlation Coefficient Distance, Information Entropy, KL (Kullback-Leibler Divergence / Relative Entropy).
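Two of the measures above, computed directly from their definitions on toy vectors:

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

a, b = [1.0, 0.0], [0.0, 1.0]
# euclidean(a, b) is sqrt(2); cosine(a, b) is 0 because they are orthogonal
```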


Optimization:


Unconstrained optimization: Cyclic Variable (coordinate rotation) Methods, Pattern Search Methods, Variable Simplex Methods, Gradient Descent Methods, Newton's Method, Quasi-Newton Methods, Conjugate Gradient Methods.
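Newton's method from the list above iterates x ← x − f′(x)/f″(x); on a quadratic it lands on the minimum in a single step. A one-dimensional sketch with a made-up objective f(x) = (x − 3)² + 1:

```python
def newton(df, d2f, x0, iters=5):
    x = x0
    for _ in range(iters):
        x = x - df(x) / d2f(x)   # Newton step using first/second derivatives
    return x

xmin = newton(lambda x: 2 * (x - 3),   # f'(x)
              lambda x: 2.0,           # f''(x)
              x0=10.0)
# xmin == 3.0: one step reaches the minimizer of a quadratic exactly
```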


Constrained optimization: Approximation Programming Methods, Feasible Direction Methods, Penalty Function Methods, Multiplier Methods.


Heuristic algorithms: SA (Simulated Annealing), GA (Genetic Algorithms).
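A compact simulated-annealing sketch: random ±1 moves on an integer line minimizing a toy objective f(x) = x², with uphill moves accepted with probability exp(−Δ/T) and a geometrically decaying temperature. Objective, schedule, and seed are all arbitrary choices:

```python
import math
import random

random.seed(0)   # fixed seed so the sketch is reproducible

def anneal(f, x0, temp=10.0, cooling=0.95, steps=200):
    x = x0
    for _ in range(steps):
        cand = x + random.choice((-1, 1))
        delta = f(cand) - f(x)
        # Always accept improvements; accept worsenings with prob exp(-d/T)
        if delta < 0 or random.random() < math.exp(-delta / temp):
            x = cand
        temp *= cooling
    return x

best = anneal(lambda x: x * x, x0=20)   # drifts to the minimum near 0
```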


Feature Selection:


Mutual Information, Document Frequency, Information Gain, Chi-squared Test, Gini coefficient.
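Information gain is the label entropy minus the weighted entropy of the children after a split; a sketch on toy counts where a binary feature separates the labels perfectly:

```python
import math

def entropy(labels):
    n = len(labels)
    return -sum((labels.count(c) / n) * math.log2(labels.count(c) / n)
                for c in set(labels))

y = [1, 1, 1, 0, 0, 0]
left, right = [1, 1, 1], [0, 0, 0]   # a perfect split on some feature
gain = entropy(y) \
     - (len(left) / len(y)) * entropy(left) \
     - (len(right) / len(y)) * entropy(right)
# gain == 1.0 bit: the split removes all uncertainty
```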


Outlier Detection:


Statistic-based, Distance-based, Density-based, Clustering-based.
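"Statistic-based" detection in its simplest form flags points whose z-score exceeds a threshold; a sketch on made-up data with one obvious anomaly (threshold of 2 chosen arbitrarily):

```python
import math

data = [10.0, 10.2, 9.8, 10.1, 9.9, 30.0]
mu = sum(data) / len(data)
sigma = math.sqrt(sum((x - mu) ** 2 for x in data) / len(data))

# Flag anything more than 2 standard deviations from the mean
outliers = [x for x in data if abs(x - mu) / sigma > 2]   # -> [30.0]
```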


Learning to Rank:


Pointwise: McRank;


Pairwise: Ranking SVM, RankNet, Frank, RankBoost;


Listwise: AdaRank, SoftRank, LambdaMART.


Tools:

MPI, the Hadoop ecosystem, Spark, BSP, Weka, Mahout, scikit-learn, PyBrain…


Author: 尾巴子


End.
