cluster analysis - Basic clustering with R
I am new to R and data analysis. I am trying to create a simple custom recommendation system for a web site. As input I have the user/session-id, item-id, and item-value of the items users clicked on:

    c165c2ee-81cf-48cf-ba3f-83b70204c00c 161785 124.0
    a886fdd5-7cee-4152-b1b7-77a2702687b0 643339 42.0
    5e5fd670-b104-445b-a36d-b3798cd43279 131332 38.0
    888d736f-99bc-49ca-969d-057e7d4bb8d1 1032763 39.0

I would like to apply cluster analysis to that data.

If I try to apply k-means clustering to my data:

    > q <- kmeans(dat, centers = 25)
    Error in do_one(nmeth) : NA/NaN/Inf in foreign function call (arg 1)
    In addition: Warning message:
    In kmeans(dat, centers = 25) : NAs introduced by coercion

If I try to apply hierarchical clustering to the data:

    > m <- as.matrix(dat)
    > d <- dist(m)
    Warning message:
    In dist(m) : NAs introduced by coercion
    > hc <- hclust(d)  # apply hierarchical clustering
    Error in hclust(d) : NA/NaN/Inf in foreign function call (arg 11)

I have also tried to run the code against dat[-1], but the result is the same.

What am I doing wrong? Thank you in advance.

=== UPDATE #1 ===

Output of str and factor:

    > str(dat)
    'data.frame': 14634 obs. of 3 variables:
     $ V3 : Factor w/ 10062 levels "000880bf-6cb7-4c4a-9a9d-1c0a975b52ba",..: 7548 6585 3670 5336 9181 6429 62 410 7386 940 9 ...
     $ V8 : Factor w/ 5561 levels "1000120","1000910",..: 835 39 9 44 443 65 1289 2084 582 695 3666 4787 ...
     $ V12: Factor w/ 395 levels "100.0","101.0",..: 25 278 24 9 256 352 24 9 1 88 361 1 ...
    > dat[, 1] = factor(dat[, 1])
    > str(dat)
    'data.frame': 14634 obs. of 3 variables:
     $ V3 : Factor w/ 10062 levels "000880bf-6cb7-4c4a-9a9d-1c0a975b52ba",..: 7548 6585 3670 5336 9181 6429 62 410 7386 940 9 ...
     $ V8 : Factor w/ 5561 levels "1000120","1000910",..: 835 39 9 44 443 65 1289 2084 582 695 3666 4787 ...
     $ V12: Factor w/ 395 levels "100.0","101.0",..: 25 278 24 9 256 352 24 9 1 88 361 1 ...

=== UPDATE #2 ===

I do not want to remove the first column, because there may be many clicks for the same user, which I consider important for the analysis.

It seems that you want to keep the first column (even though 10062 levels for 14634 observations is quite high). The way to convert a factor into numeric values is with the model.matrix function. Before you convert your factor:

    data(iris)
    head(iris)
    #   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
    # 1          5.1         3.5          1.4         0.2  setosa
    # 2          4.9         3.0          1.4         0.2  setosa
    # 3          4.7         3.2          1.3         0.2  setosa
    # 4          4.6         3.1          1.5         0.2  setosa
    # 5          5.0         3.6          1.4         0.2  setosa
    # 6          5.4         3.9          1.7         0.4  setosa

After model.matrix:

    head(model.matrix(~ . + 0, data = iris))
    #   Sepal.Length Sepal.Width Petal.Length Petal.Width Speciessetosa Speciesversicolor Speciesvirginica
    # 1          5.1         3.5          1.4         0.2             1                 0                0
    # 2          4.9         3.0          1.4         0.2             1                 0                0
    # 3          4.7         3.2          1.3         0.2             1                 0                0
    # 4          4.6         3.1          1.5         0.2             1                 0                0
    # 5          5.0         3.6          1.4         0.2             1                 0                0
    # 6          5.4         3.9          1.7         0.4             1                 0                0

As you can see, it expands your factor values into indicator columns. You can then run k-means clustering on the expanded version of your data:

    kmeans(model.matrix(~ . + 0, data = iris), centers = 3)
    # K-means clustering with 3 clusters of sizes 49, 50, 51
    #
    # Cluster means:
    #   Sepal.Length Sepal.Width Petal.Length Petal.Width Speciessetosa Speciesversicolor Speciesvirginica
    # 1     6.622449    2.983673     5.573469    2.032653             0         0.0000000       1.00000000
    # 2     5.006000    3.428000     1.462000    0.246000             1         0.0000000       0.00000000
    # 3     5.915686    2.764706     4.264706    1.333333             0         0.9803922       0.01960784
    # ...
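
The same idea can be sketched on data shaped like the question's. The str output shows all three columns stored as factors, which is the likely source of the "NAs introduced by coercion" warnings: the id strings cannot be coerced to numbers. The data frame below is a made-up six-row stand-in (the user ids "u1" to "u4" are hypothetical, and centers = 2 is chosen only because the sample is tiny), so treat it as an illustration under those assumptions rather than a tuned solution:

```r
# Hypothetical stand-in for the question's data: every column is a factor,
# as in the str() output (V3 = user/session id, V8 = item id, V12 = item value).
dat <- data.frame(
  V3  = factor(c("u1", "u1", "u2", "u3", "u3", "u4")),
  V8  = factor(c("161785", "643339", "131332", "1032763", "161785", "643339")),
  V12 = factor(c("124.0", "42.0", "38.0", "39.0", "124.0", "42.0"))
)

# as.numeric() on a factor returns the internal level codes, so go through
# as.character() to recover the number that each label spells out.
dat$V12 <- as.numeric(as.character(dat$V12))

# Expand the remaining factors (the ids) into 0/1 indicator columns.
# "+ 0" drops the intercept column.
x <- model.matrix(~ . + 0, data = dat)

# The matrix is now fully numeric, so both clustering calls accept it.
set.seed(1)                      # k-means starts from random centers
km <- kmeans(x, centers = 2)
hc <- hclust(dist(x))
```

With 10062 user levels the expanded matrix becomes very wide, and the unscaled value column will dominate the Euclidean distances, so it may be worth rescaling the columns or aggregating clicks per user before clustering.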