
Factors in R

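Factors are data objects used to categorize data and store it as levels. They can be built from both strings and integers. They are useful for columns that hold a limited number of unique values, such as "Male", "Female", "True", and "False", and they are widely used in data analysis for statistical modeling.

A factor is created with the factor() function, which takes a vector as input.

Example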

# Create a vector as input.
data <- c("East","West","East","North","North","East","West","West","West","East","North")

print(data)
print(is.factor(data))

# Apply the factor function.
factor_data <- factor(data)

print(factor_data)
print(is.factor(factor_data))

When we execute the above code, it produces the following result:

 [1] "East"  "West"  "East"  "North" "North" "East"  "West"  "West"  "West"  "East"  "North"
[1] FALSE
 [1] East  West  East  North North East  West  West  West  East  North
Levels: East North West
[1] TRUE
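
Internally, a factor is stored as an integer vector of codes together with a character vector of levels. The following is a minimal sketch, reusing the same input vector as the example above, of one way to inspect both parts with the base R functions levels(), nlevels(), as.integer(), and table().

```r
# Same input vector as in the example above.
data <- c("East","West","East","North","North","East","West","West","West","East","North")
factor_data <- factor(data)

# The distinct categories, sorted alphabetically by default.
print(levels(factor_data))

# Number of levels.
print(nlevels(factor_data))

# Integer codes that index into the levels vector.
print(as.integer(factor_data))

# Count of observations per level.
print(table(factor_data))
```

Because the levels are sorted alphabetically by default, "East" gets code 1, "North" code 2, and "West" code 3.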

Factors in a Data Frame

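When a data frame is created with a column of text data, R can treat that column as categorical data and store it as a factor. In R versions before 4.0.0 this happened by default; from R 4.0.0 onward the default is stringsAsFactors = FALSE, so the argument is set explicitly in the example below.

# Create the vectors for data frame.
height <- c(132,151,162,139,166,147,122)
weight <- c(48,49,66,53,67,52,40)
gender <- c("male","male","female","female","male","female","male")

# Create the data frame, converting text columns to factors.
input_data <- data.frame(height, weight, gender, stringsAsFactors = TRUE)
print(input_data)

# Test if the gender column is a factor.
print(is.factor(input_data$gender))

# Print the gender column to see the levels.
print(input_data$gender)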

When we execute the above code, it produces the following result:

  height weight gender
1    132     48   male
2    151     49   male
3    162     66 female
4    139     53 female
5    166     67   male
6    147     52 female
7    122     40   male
[1] TRUE
[1] male   male   female female male   female male  
Levels: female male
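
A text column that was read in as character (the default since R 4.0.0) can also be converted to a factor afterwards. The following is a minimal sketch of that approach, assuming the same height and gender vectors as above; the data frame name people is hypothetical and used only for illustration.

```r
height <- c(132,151,162,139,166,147,122)
gender <- c("male","male","female","female","male","female","male")

# Keep text as character when building the data frame.
people <- data.frame(height, gender, stringsAsFactors = FALSE)
print(is.factor(people$gender))

# Convert the column to a factor explicitly.
people$gender <- factor(people$gender)
print(is.factor(people$gender))
print(levels(people$gender))

# With a factor column, grouped summaries are straightforward:
# mean height for each gender level.
mean_height <- tapply(people$height, people$gender, mean)
print(mean_height)
```

Converting explicitly keeps the choice of which columns are categorical in the hands of the analyst rather than relying on a version-dependent default.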

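Changing the Order of Levels

The order of the levels in a factor can be changed by applying the factor() function again with the levels argument set to the new order. See the following code: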

data <- c("East","West","East","North","North","East","West","West","West","East","North")
# Create the factors
factor_data <- factor(data)
print(factor_data)

# Apply the factor function with required order of the level.
new_order_data <- factor(factor_data,levels = c("East","West","North"))
print(new_order_data)

When we execute the above code, it produces the following result:

 [1] East  West  East  North North East  West  West  West  East  North
Levels: East North West
 [1] East  West  East  North North East  West  West  West  East  North
Levels: East West North
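
The level order matters in practice because functions such as table() report results in level order, and the integer codes behind the factor follow it as well. The sketch below, which rebuilds the two factors from the example above, illustrates this; it is meant as an illustration rather than a complete account of how level order is used.

```r
data <- c("East","West","East","North","North","East","West","West","West","East","North")
factor_data <- factor(data)
new_order_data <- factor(factor_data, levels = c("East","West","North"))

# Counts are reported in level order, not alphabetical order.
print(table(new_order_data))

# The underlying values are unchanged; only the level order differs.
print(all(as.character(new_order_data) == as.character(factor_data)))

# The integer codes now follow the new order (East = 1, West = 2, North = 3).
print(as.integer(new_order_data))
```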

Generating Factor Levels

Factor levels can be generated with the gl() function. It takes two integers as input: the number of levels and the number of times each level is repeated.

Syntax

gl(n, k, labels)

The parameters are described below:

  • n - an integer giving the number of levels.
  • k - an integer giving the number of replications.
  • labels - a vector of labels for the resulting factor levels.

Example

v <- gl(3, 4, labels = c("Tampa", "Seattle","Boston"))
print(v)

When we execute the above code, it produces the following result:

 [1] Tampa   Tampa   Tampa   Tampa   Seattle Seattle Seattle Seattle Boston 
[10] Boston  Boston  Boston 
Levels: Tampa Seattle Boston
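
Beyond the three parameters listed above, gl() uses the integers 1 to n as labels when labels is omitted, and it accepts an optional length argument giving the length of the result. The following sketch shows both behaviors; the variable names a and b and the labels "Low" and "High" are chosen only for illustration.

```r
# Without labels, the levels are the integers 1..n.
a <- gl(2, 3)
print(a)

# With k = 1 and a length, the levels alternate until the length is reached.
b <- gl(2, 1, length = 6, labels = c("Low", "High"))
print(b)

# The result is an ordinary factor.
print(is.factor(b))
print(nlevels(b))
```

The alternating pattern can be handy when laying out a balanced design in which treatments cycle across consecutive observations.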
