Text Mining in Python


Loading the data

# ---- Load the data ----

library(ggplot2) # visualization
# install.packages("dplyr")
# install.packages("tidyr")
library(dplyr) # data wrangling
library(reshape) # data wrangling (can be replaced by tidyr)
library(readr) # file I/O


# Read the CSV and drop the first column
raw_reviews = read_csv("data/Womens Clothing E-Commerce Reviews.csv") %>% select(-1)

# raw_reviews <- raw_reviews %>% select(-1)
glimpse(raw_reviews)

colnames(raw_reviews) <- c("ID", "Age", "Title", "Review", "Rating", "Recommend", "Liked", "Division", "Dept", "Class")

glimpse(raw_reviews)

# Age: age of the customer who wrote the review
# Title, Review Text: review title and body
# Rating: rating given by the customer
# Recommend IND: whether the customer recommends the product
# Positive Feedback Count: number of likes
# Division, Dept, Class: product category information

Data preprocessing

# ---- Data preprocessing ----
# Check for missing values
colSums(is.na(raw_reviews))

table(raw_reviews$Age)

age_group = cut(as.numeric(raw_reviews$Age),
                breaks = seq(10, 100, by = 10),
                include.lowest = TRUE,
                right = FALSE,
                labels = paste0(seq(10, 90, by = 10), "th"))

age_group[1:10]

# Add the new variable
raw_reviews$age_group = age_group
table(raw_reviews$age_group)

# Convert to a sentiment dataset
summary(raw_reviews$Liked)
table(raw_reviews$Liked)

# Stratified sampling? / simple random sampling (used here): keep 10% of the rows
set.seed(1234) # added so the sample is reproducible
idx = sample(1:nrow(raw_reviews), nrow(raw_reviews) * 0.1, replace = FALSE)

raw_reviews2 = raw_reviews[idx, ]

raw_reviews2 %>%
  mutate(pos_binary = ifelse(Liked > 0, 1, 0)) %>% # convert to a discrete variable
  select(Liked, pos_binary) -> pos_binary_df

pos_binary_df$pos_binary <- as.factor(pos_binary_df$pos_binary)

table(pos_binary_df$pos_binary) # 0 = negative, 1 = positive

# ---- Build the keyword dataset ----
REVIEW_TEXT = as.character(raw_reviews2$Review)
REVIEW_TEXT = tolower(REVIEW_TEXT) # lowercase the character version, not the raw column

# Tokenize and stem each review, then rebuild it as one space-separated string
library(tokenizers)

TEXT_Token = character(length(REVIEW_TEXT))
for (i in seq_along(REVIEW_TEXT)) {
  token_words = unlist(tokenize_word_stems(REVIEW_TEXT[i]))
  TEXT_Token[i] = paste(token_words, collapse = " ")
}
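
To see what the cut() call in the preprocessing code produces, here is a minimal sketch on a handful of made-up ages (the `ages` vector is illustrative and not taken from the dataset). With right = FALSE each bin is closed on the left, so an age of 30 lands in the 30 bin rather than the 20 bin.

```r
# Hypothetical ages, chosen to sit on and around the bin edges
ages <- c(19, 20, 29, 30, 45, 99)

age_group <- cut(ages,
                 breaks = seq(10, 100, by = 10),
                 include.lowest = TRUE,
                 right = FALSE,
                 labels = paste0(seq(10, 90, by = 10), "th"))

# Pair each age with its bin label
data.frame(age = ages, age_group = age_group)

# Count how many ages fall in each bin
table(age_group)
```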

Text preprocessing

# ---- Text preprocessing ----
library(tm)

Corpus_token = Corpus(VectorSource(TEXT_Token))
# Chain the steps: each one takes the previous result, not the raw corpus
Corpus_tm_token = tm_map(Corpus_token, removePunctuation)
Corpus_tm_token = tm_map(Corpus_tm_token, removeNumbers)
Corpus_tm_token = tm_map(Corpus_tm_token, removeWords, stopwords("english"))


# TDM vs. DTM (TDM: Term-Document Matrix, DTM: Document-Term Matrix)
# cf. TF-IDF; DTM = CountVectorizer (in Python)
DTM_Token = DocumentTermMatrix(Corpus_tm_token)
DTM_Matrix_Token = as.matrix(DTM_Token)

# Extract the top keywords
# using the quantile() function
top_1_pct = colSums(DTM_Matrix_Token) > quantile(colSums(DTM_Matrix_Token), probs = 0.99)

DTM_Matrix_Token_selected = DTM_Matrix_Token[, top_1_pct]

ncol(DTM_Matrix_Token_selected)

# Error
DTM_df = as.data.frame(DTM_Matrix_Token_selected)
DTM_df

pos_final_df = cbind(pos_binary_df, DTM_df)

glimpse(pos_final_df)


# This is where the sparse-matrix problem shows up.

ncol(pos_final_df)
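
The quantile-based keyword filter can be tried without tm on a small hand-made matrix. The sketch below uses invented counts and a 0.75 cutoff instead of 0.99 so that the effect is visible on five terms. The `drop = FALSE` argument is an addition not in the code above: it keeps the result a matrix even when only one column survives.

```r
# Toy document-term matrix: 4 documents x 5 terms (counts are made up)
dtm <- matrix(c(2, 0, 1, 0, 0,
                1, 1, 0, 0, 0,
                3, 0, 0, 1, 0,
                2, 0, 0, 0, 1),
              nrow = 4, byrow = TRUE,
              dimnames = list(paste0("doc", 1:4),
                              c("love", "fit", "size", "top", "dress")))

term_freq <- colSums(dtm)                     # total count per term
cutoff    <- quantile(term_freq, probs = 0.75)
keep      <- term_freq > cutoff               # logical mask over the columns

# drop = FALSE keeps a matrix even if a single column is selected
dtm_selected <- dtm[, keep, drop = FALSE]
colnames(dtm_selected)
```

Only terms whose total count is strictly above the cutoff are kept, so the number of surviving columns depends on how skewed the term frequencies are.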

Splitting into training and test data

# ---- Split into training and test data ----
set.seed(1234)
idx = sample(1:nrow(pos_final_df), nrow(pos_final_df) * 0.7, replace = FALSE)
train = pos_final_df[idx, ]
test = pos_final_df[-idx, ]
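
The preprocessing code raises the question of stratified versus simple random sampling. The split above is a simple random one. As a rough sketch of the stratified alternative, the code below samples 70% of the rows within each class so that both sets keep the class ratio. The data frame `df` and its 80/20 class balance are invented for illustration.

```r
set.seed(1234)

# Toy labels with a 4:1 class imbalance (made up for illustration)
df <- data.frame(id = 1:100,
                 pos_binary = factor(rep(c(0, 1), times = c(80, 20))))

# Sample 70% of the row indices within each class.
# Indexing by position avoids sample()'s special case for length-one input.
train_idx <- unlist(lapply(split(seq_len(nrow(df)), df$pos_binary),
                           function(rows) {
                             rows[sample.int(length(rows), round(length(rows) * 0.7))]
                           }))

train <- df[train_idx, ]
test  <- df[-train_idx, ]

table(train$pos_binary)
table(test$pos_binary)
```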

Developing the logistic regression model

# ---- Develop the logistic regression model ----

start_time = Sys.time()

glm_model = step(glm(pos_binary ~ .,
                     data = train[-1], # drop the first column (the raw Liked count)
                     family = binomial(link = "logit")),
                 direction = "backward") # backward elimination

End_time = Sys.time()
difftime(End_time, start_time, units = "secs")

Step: AIC=2202.56

  • AIC is the evaluation criterion used within the logistic regression
  • Lower is better.

Step: AIC=2202.2
pos_binary ~ love + veri + just + size + dress + fit + will +
back + like + tri + flatter + top + length + realli + shirt +
materi

AIC_LogisticR
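
The AIC values printed by step() can be reproduced by hand, which makes the "lower is better" rule concrete: AIC = 2k - 2 * logLik, where k is the number of estimated coefficients. The sketch below uses simulated data (the variables `x1`, `x2`, `y` and the helper `aic_by_hand` are all invented here) and checks the hand computation against R's own AIC().

```r
set.seed(1234)

# Simulated data: y depends on x1 only; x2 is pure noise (all made up)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- rbinom(n, size = 1, prob = plogis(0.5 + 1.2 * x1))

fit_full  <- glm(y ~ x1 + x2, family = binomial(link = "logit"))
fit_small <- glm(y ~ x1,      family = binomial(link = "logit"))

# AIC = 2k - 2 * logLik, with k = number of estimated coefficients
aic_by_hand <- function(fit) {
  k <- length(coef(fit))
  2 * k - 2 * as.numeric(logLik(fit))
}

c(full = aic_by_hand(fit_full), small = aic_by_hand(fit_small))
c(full = AIC(fit_full),         small = AIC(fit_small))  # should agree with the line above
```

Backward elimination drops a term whenever doing so lowers this number, which is why the AIC shrinks from one step to the next in the output.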

Measuring model performance

# ---- Measure model performance ----
# install.packages("pROC")
library(pROC)
preds = predict(glm_model, newdata = test, type = "response")
roc_glm = roc(test$pos_binary, preds)
plot.roc(roc_glm, print.auc=TRUE)

R_calssification_pROC
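
For intuition about the AUC that plot.roc() prints, it can be computed without pROC through the rank-sum identity: the AUC is the probability that a randomly chosen positive gets a higher score than a randomly chosen negative. This is a swapped-in hand computation, not what pROC does internally, and the labels and scores below are made up.

```r
# Made-up labels (1 = positive) and predicted probabilities
labels <- c(0, 0, 1, 0, 1, 1, 0, 1)
scores <- c(0.10, 0.35, 0.40, 0.45, 0.60, 0.70, 0.80, 0.90)

# AUC via the rank-sum (Mann-Whitney) identity
auc_by_rank <- function(labels, scores) {
  r     <- rank(scores)              # average ranks handle tied scores
  n_pos <- sum(labels == 1)
  n_neg <- sum(labels == 0)
  (sum(r[labels == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

auc_by_rank(labels, scores)  # 0.75: 12 of the 16 positive/negative pairs are ordered correctly
```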

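Summary

1. Load the structured data
2. Process the structured data
    - Use the like count to split the data into positive/negative
3. Split the structured data: separate out the text data
4. Process the text data (preprocessing, tokenization, corpus, DTM)
5. Merge the text data with the existing data
6. Fit an ML model (other models can be used as well.)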
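
Steps 2 to 5 can be sketched end to end in base R on a few invented reviews. This is only an illustration: it replaces tokenizers and tm with strsplit() and table(), skips stemming and stop-word removal, and the review texts and like counts are made up.

```r
# Made-up reviews and like counts
reviews <- data.frame(
  Review = c("Love this dress", "Runs small, returned it",
             "Love the fit", "Fabric feels cheap"),
  Liked  = c(3, 0, 1, 0),
  stringsAsFactors = FALSE
)

# Step 2: a like count above zero becomes the positive class
reviews$pos_binary <- factor(ifelse(reviews$Liked > 0, 1, 0))

# Steps 3-4: lowercase, strip punctuation, split on whitespace
tokens <- strsplit(gsub("[[:punct:]]", "", tolower(reviews$Review)), "\\s+")

# Document-term matrix: one row per review, one column per distinct word
vocab <- sort(unique(unlist(tokens)))
dtm <- t(vapply(tokens,
                function(tok) as.integer(table(factor(tok, levels = vocab))),
                integer(length(vocab))))
colnames(dtm) <- vocab

# Step 5: bind the label to the term counts
final_df <- cbind(reviews["pos_binary"], as.data.frame(dtm))
final_df[, c("pos_binary", "love", "dress", "fit")]
```

The resulting data frame has the same shape as pos_final_df: one label column followed by one count column per term, ready for step 6.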

However, if what has been covered so far feels too difficult, working only in Python is not a bad option either.

Text Mining in R (02)

Previous post: Text Mining in R (01), installing and using library(KoNLP) and useNIADic()

Next post: Text Mining in R (03)


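§ Installing MeCab

To use the mecab-ko morphological analyzer, the RcppMeCab package is required.

RcppMeCab install file URL:

Download the files to install from that GitHub repository, then:

RcppMeCab_zipfiles

  • When extracting, create a mecab folder on the C: drive.
  • The easy way is to right-click and choose "Extract here".

  • During this step

    Rcppmecab

  • If the folder structure, file names, and paths do not match the ones shown above, the following error occurs.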

Exception:
list()


§ Installing in R

# library(remotes)
remotes::install_github("junhewk/RcppMeCab", force = TRUE)

library(RcppMeCab)

Downloading GitHub repo junhewk/RcppMeCab@HEAD
Installing 2 packages: BH, RcppParallel
‘C:/Users/brill/Documents/R/win-library/4.1’의 위치에 패키지(들)을 설치합니다.
(왜냐하면 ‘lib’가 지정되지 않았기 때문입니다)
trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/BH_1.75.0-0.zip'
Content type ‘application/zip’ length 19675040 bytes (18.8 MB)
downloaded 18.8 MB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/RcppParallel_5.1.4.zip'
Content type ‘application/zip’ length 2140731 bytes (2.0 MB)
downloaded 2.0 MB

package ‘BH’ successfully unpacked and MD5 sums checked
package ‘RcppParallel’ successfully unpacked and MD5 sums checked

The downloaded binary packages are in
C:\Users\brill\AppData\Local\Temp\RtmpmuDZXg\downloaded_packages
√ checking for file ‘C:\Users\brill\AppData\Local\Temp\RtmpmuDZXg\remotes2cd0f4c5d4d\junhewk-RcppMeCab-e1800aa/DESCRIPTION’ (414ms)

  • preparing ‘RcppMeCab’: (373ms)
    √ checking DESCRIPTION meta-information …
  • cleaning src
  • checking for LF line-endings in source and make files and shell scripts
  • checking for empty or unneeded directories
    Omitted ‘LazyData’ from DESCRIPTION
  • building ‘RcppMeCab_0.0.1.3-2.tar.gz’

‘C:/Users/brill/Documents/R/win-library/4.1’의 위치에 패키지(들)을 설치합니다.
(왜냐하면 ‘lib’가 지정되지 않았기 때문입니다)

  • installing source package ‘RcppMeCab’ …
    • using staged installation
    • libs
      “C:/rtools40/mingw64/bin/“g++ -std=gnu++11 -I”C:/PROGRA1/R/R-411.2/include” -DNDEBUG -I../inst/include -DBOOST_NO_AUTO_PTR -I’C:/Users/brill/Documents/R/win-library/4.1/Rcpp/include’ -I’C:/Users/brill/Documents/R/win-library/4.1/RcppParallel/include’ -I’C:/Users/brill/Documents/R/win-library/4.1/BH/include’ -DRCPP_PARALLEL_USE_TBB=1 -DDLL_IMPORT -DSTRICT_R_HEADERS -Wno-parentheses -O2 -Wall -mfpmath=sse -msse2 -mstackrealign -c RcppExports.cpp -o RcppExports.o
      “C:/rtools40/mingw64/bin/“g++ -std=gnu++11 -I”C:/PROGRA1/R/R-411.2/include” -DNDEBUG -I../inst/include -DBOOST_NO_AUTO_PTR -I’C:/Users/brill/Documents/R/win-library/4.1/Rcpp/include’ -I’C:/Users/brill/Documents/R/win-library/4.1/RcppParallel/include’ -I’C:/Users/brill/Documents/R/win-library/4.1/BH/include’ -DRCPP_PARALLEL_USE_TBB=1 -DDLL_IMPORT -DSTRICT_R_HEADERS -Wno-parentheses -O2 -Wall -mfpmath=sse -msse2 -mstackrealign -c posParallelRcpp.cpp -o posParallelRcpp.o
      “C:/rtools40/mingw64/bin/“g++ -std=gnu++11 -I”C:/PROGRA1/R/R-411.2/include” -DNDEBUG -I../inst/include -DBOOST_NO_AUTO_PTR -I’C:/Users/brill/Documents/R/win-library/4.1/Rcpp/include’ -I’C:/Users/brill/Documents/R/win-library/4.1/RcppParallel/include’ -I’C:/Users/brill/Documents/R/win-library/4.1/BH/include’ -DRCPP_PARALLEL_USE_TBB=1 -DDLL_IMPORT -DSTRICT_R_HEADERS -Wno-parentheses -O2 -Wall -mfpmath=sse -msse2 -mstackrealign -c posRcpp.cpp -o posRcpp.o
      “C:/rtools40/mingw64/bin/“g++ -std=gnu++11 -I”C:/PROGRA1/R/R-411.2/include” -DNDEBUG -I../inst/include -DBOOST_NO_AUTO_PTR -I’C:/Users/brill/Documents/R/win-library/4.1/Rcpp/include’ -I’C:/Users/brill/Documents/R/win-library/4.1/RcppParallel/include’ -I’C:/Users/brill/Documents/R/win-library/4.1/BH/include’ -DRCPP_PARALLEL_USE_TBB=1 -DDLL_IMPORT -DSTRICT_R_HEADERS -Wno-parentheses -O2 -Wall -mfpmath=sse -msse2 -mstackrealign -c posloopRcpp.cpp -o posloopRcpp.o
      C:/rtools40/mingw64/bin/g++ -shared -s -static-libgcc -o RcppMeCab.dll tmp.def RcppExports.o posParallelRcpp.o posRcpp.o posloopRcpp.o -L../inst/libs/x64 -LC:/Users/brill/Documents/R/win-library/4.1/RcppParallel/lib/x64 -ltbb -ltbbmalloc -lm -llibmecab -LC:/PROGRA1/R/R-411.2/bin/x64 -lR
      installing to C:/Users/brill/Documents/R/win-library/4.1/00LOCK-RcppMeCab/00new/RcppMeCab/libs/x64
    • R
    • inst
    • byte-compile and prepare package for lazy loading
    • help
  • ** installing help indices
    converting help for package ‘RcppMeCab’
    finding HTML links … done
    RcppMeCab html
    pos html
    posParallel html
    • building package indices
    • testing if installed package can be loaded from temporary location
    • testing if installed package can be loaded from final location
    • testing if installed package keeps a record of temporary installation path
  • DONE (RcppMeCab)

Verifying the RcppMeCab installation (morphological analyzer)

Put some Korean text in text1.

text1 = "안녕하세요?!"
pos(sentence = text1)

$�ȳ\xe7\xc7ϼ��\xe4?!
[1] "�/SY" "ȳ/SL" "\xe7\xc7\xcf/SH"
[4] "���/SY" "\xe4?!/SH"



- This happens because the text is not encoded in UTF-8.

text2 = enc2utf8(text1)
pos(sentence = text2)

$안녕하세요?!
[1] "안녕/NNG" "하/XSV" "세요/EP+EF" "?/SF" "!/SF"
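
The encoding problem can be reproduced without RcppMeCab. The sketch below is an illustration under two assumptions: the broken input was CP949 (a common native encoding on Korean Windows), and the platform's iconv knows the name "CP949". It uses iconv() with explicit encodings rather than enc2utf8(), so the behavior does not depend on the session's locale.

```r
# "안녕하세요?!" written with \u escapes, so the literal is UTF-8
# regardless of how this script file itself is saved
text_utf8 <- "\uc548\ub155\ud558\uc138\uc694?!"

# Re-encode to CP949 to imitate a string from a Korean Windows locale
# (assumption: iconv on this platform supports "CP949")
text_cp949 <- iconv(text_utf8, from = "UTF-8", to = "CP949")

validUTF8(text_utf8)    # is the byte sequence valid UTF-8?
validUTF8(text_cp949)   # the CP949 bytes are not

# Convert back before handing the text to a UTF-8-only tool
round_trip <- iconv(text_cp949, from = "CP949", to = "UTF-8")
identical(round_trip, text_utf8)
```

enc2utf8() performs the same kind of conversion, but from whatever the native encoding of the current session is, which is why it fixes the output above on a Korean Windows machine.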

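Get help from the instructor
Watch the instructor's lecture


Maybe it is because the pay is not enough, but our class does not walk us through it like this.
It is not that they cannot teach it but that they choose not to, which makes me angry, though I suppose everyone has their own circumstances.
I also wonder whether I expected too much from a government-funded course.

Installation and verification complete.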

Text Mining in R (01)

Text mining with R

** Notes on installing R are here.

BIGKinds (Korea)

Sentiment analysis

  • Check whether comments are negative or positive

Setting up the R environment

install.packages("multilinguer")
# If the install above fails, install from GitHub instead

install.packages("remotes")
remotes::install_github("mrchypark/multilinguer")

library(multilinguer) # install_jdk() comes from this package
install_jdk()
# Installs Java and sets up the PATH automatically

package ‘rJava’ successfully unpacked and MD5 sums checked



Installing Rtools (PATH setup)

  • If Rtools is already installed, skip this step
  • After installing Rtools,
  • run the code below, then quit RStudio and restart it

write('PATH="${RTOOLS40_HOME}\\usr\\bin;${PATH}"', 
file = "~/.Renviron", append = TRUE)
Sys.which("make")

                              make
"C:\\rtools40\\usr\\bin\\make.exe"
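
Before building packages from source, the Sys.which() check can be wrapped in a small guard. `has_build_tool` is a hypothetical helper name introduced here, not part of any package. Sys.which() returns an empty string when the tool is not on the PATH, which is what the helper tests for.

```r
# Returns TRUE when an executable called `tool` is found on the PATH.
# `has_build_tool` is a hypothetical helper, not part of any package.
has_build_tool <- function(tool = "make") {
  nzchar(Sys.which(tool))[[1]]
}

if (has_build_tool("make")) {
  message("make found at: ", Sys.which("make"))
} else {
  message("make not found; check the PATH line in ~/.Renviron and restart R")
}
```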

jsonlite install

install.packages("jsonlite", type = "source")

‘C:/Users/brill/Documents/R/win-library/4.1’의 위치에 패키지(들)을 설치합니다.
(왜냐하면 ‘lib’가 지정되지 않았기 때문입니다)
trying URL ‘https://cran.rstudio.com/src/contrib/jsonlite_1.7.2.tar.gz'
Content type ‘application/x-gzip’ length 421716 bytes (411 KB)
downloaded 411 KB

  • installing source package ‘jsonlite’ …
    • package ‘jsonlite’ successfully unpacked and MD5 sums checked
    • using staged installation
    • libs
      “C:/rtools40/mingw64/bin/“gcc -I”C:/PROGRA1/R/R-411.2/include” -DNDEBUG -Iyajl/api -D__USE_MINGW_ANSI_STDIO -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c base64.c -o base64.o
      “C:/rtools40/mingw64/bin/“gcc -I”C:/PROGRA1/R/R-411.2/include” -DNDEBUG -Iyajl/api -D__USE_MINGW_ANSI_STDIO -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c collapse_array.c -o collapse_array.o
      “C:/rtools40/mingw64/bin/“gcc -I”C:/PROGRA1/R/R-411.2/include” -DNDEBUG -Iyajl/api -D__USE_MINGW_ANSI_STDIO -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c collapse_object.c -o collapse_object.o
      “C:/rtools40/mingw64/bin/“gcc -I”C:/PROGRA1/R/R-411.2/include” -DNDEBUG -Iyajl/api -D__USE_MINGW_ANSI_STDIO -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c collapse_pretty.c -o collapse_pretty.o
      “C:/rtools40/mingw64/bin/“gcc -I”C:/PROGRA1/R/R-411.2/include” -DNDEBUG -Iyajl/api -D__USE_MINGW_ANSI_STDIO -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c escape_chars.c -o escape_chars.o
      “C:/rtools40/mingw64/bin/“gcc -I”C:/PROGRA1/R/R-411.2/include” -DNDEBUG -Iyajl/api -D__USE_MINGW_ANSI_STDIO -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c integer64_to_na.c -o integer64_to_na.o
      “C:/rtools40/mingw64/bin/“gcc -I”C:/PROGRA1/R/R-411.2/include” -DNDEBUG -Iyajl/api -D__USE_MINGW_ANSI_STDIO -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c is_datelist.c -o is_datelist.o
      “C:/rtools40/mingw64/bin/“gcc -I”C:/PROGRA1/R/R-411.2/include” -DNDEBUG -Iyajl/api -D__USE_MINGW_ANSI_STDIO -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c is_recordlist.c -o is_recordlist.o
      “C:/rtools40/mingw64/bin/“gcc -I”C:/PROGRA1/R/R-411.2/include” -DNDEBUG -Iyajl/api -D__USE_MINGW_ANSI_STDIO -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c is_scalarlist.c -o is_scalarlist.o
      “C:/rtools40/mingw64/bin/“gcc -I”C:/PROGRA1/R/R-411.2/include” -DNDEBUG -Iyajl/api -D__USE_MINGW_ANSI_STDIO -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c modp_numtoa.c -o modp_numtoa.o
      “C:/rtools40/mingw64/bin/“gcc -I”C:/PROGRA1/R/R-411.2/include” -DNDEBUG -Iyajl/api -D__USE_MINGW_ANSI_STDIO -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c null_to_na.c -o null_to_na.o
      “C:/rtools40/mingw64/bin/“gcc -I”C:/PROGRA1/R/R-411.2/include” -DNDEBUG -Iyajl/api -D__USE_MINGW_ANSI_STDIO -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c num_to_char.c -o num_to_char.o
      “C:/rtools40/mingw64/bin/“gcc -I”C:/PROGRA1/R/R-411.2/include” -DNDEBUG -Iyajl/api -D__USE_MINGW_ANSI_STDIO -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c parse.c -o parse.o
      “C:/rtools40/mingw64/bin/“gcc -I”C:/PROGRA1/R/R-411.2/include” -DNDEBUG -Iyajl/api -D__USE_MINGW_ANSI_STDIO -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c prettify.c -o prettify.o
      “C:/rtools40/mingw64/bin/“gcc -I”C:/PROGRA1/R/R-411.2/include” -DNDEBUG -Iyajl/api -D__USE_MINGW_ANSI_STDIO -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c push_parser.c -o push_parser.o
      “C:/rtools40/mingw64/bin/“gcc -I”C:/PROGRA1/R/R-411.2/include” -DNDEBUG -Iyajl/api -D__USE_MINGW_ANSI_STDIO -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c r-base64.c -o r-base64.o
      “C:/rtools40/mingw64/bin/“gcc -I”C:/PROGRA1/R/R-411.2/include” -DNDEBUG -Iyajl/api -D__USE_MINGW_ANSI_STDIO -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c register.c -o register.o
      “C:/rtools40/mingw64/bin/“gcc -I”C:/PROGRA1/R/R-411.2/include” -DNDEBUG -Iyajl/api -D__USE_MINGW_ANSI_STDIO -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c row_collapse.c -o row_collapse.o
      “C:/rtools40/mingw64/bin/“gcc -I”C:/PROGRA1/R/R-411.2/include” -DNDEBUG -Iyajl/api -D__USE_MINGW_ANSI_STDIO -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c transpose_list.c -o transpose_list.o
      “C:/rtools40/mingw64/bin/“gcc -I”C:/PROGRA1/R/R-411.2/include” -DNDEBUG -Iyajl/api -D__USE_MINGW_ANSI_STDIO -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c validate.c -o validate.o
      “C:/rtools40/mingw64/bin/“gcc -I”C:/PROGRA1/R/R-411.2/include” -DNDEBUG -Iyajl/api -D__USE_MINGW_ANSI_STDIO -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c yajl/yajl.c -o yajl/yajl.o
      “C:/rtools40/mingw64/bin/“gcc -I”C:/PROGRA1/R/R-411.2/include” -DNDEBUG -Iyajl/api -D__USE_MINGW_ANSI_STDIO -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c yajl/yajl_alloc.c -o yajl/yajl_alloc.o
      “C:/rtools40/mingw64/bin/“gcc -I”C:/PROGRA1/R/R-411.2/include” -DNDEBUG -Iyajl/api -D__USE_MINGW_ANSI_STDIO -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c yajl/yajl_buf.c -o yajl/yajl_buf.o
      “C:/rtools40/mingw64/bin/“gcc -I”C:/PROGRA1/R/R-411.2/include” -DNDEBUG -Iyajl/api -D__USE_MINGW_ANSI_STDIO -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c yajl/yajl_encode.c -o yajl/yajl_encode.o
      “C:/rtools40/mingw64/bin/“gcc -I”C:/PROGRA1/R/R-411.2/include” -DNDEBUG -Iyajl/api -D__USE_MINGW_ANSI_STDIO -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c yajl/yajl_gen.c -o yajl/yajl_gen.o
      “C:/rtools40/mingw64/bin/“gcc -I”C:/PROGRA1/R/R-411.2/include” -DNDEBUG -Iyajl/api -D__USE_MINGW_ANSI_STDIO -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c yajl/yajl_lex.c -o yajl/yajl_lex.o
      “C:/rtools40/mingw64/bin/“gcc -I”C:/PROGRA1/R/R-411.2/include” -DNDEBUG -Iyajl/api -D__USE_MINGW_ANSI_STDIO -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c yajl/yajl_parser.c -o yajl/yajl_parser.o
      “C:/rtools40/mingw64/bin/“gcc -I”C:/PROGRA1/R/R-411.2/include” -DNDEBUG -Iyajl/api -D__USE_MINGW_ANSI_STDIO -O2 -Wall -std=gnu99 -mfpmath=sse -msse2 -mstackrealign -c yajl/yajl_tree.c -o yajl/yajl_tree.o
      “C:/rtools40/mingw64/bin/“ar rcs yajl/libstatyajl.a yajl/yajl.o yajl/yajl_alloc.o yajl/yajl_buf.o yajl/yajl_encode.o yajl/yajl_gen.o yajl/yajl_lex.o yajl/yajl_parser.o yajl/yajl_tree.o
      C:/rtools40/mingw64/bin/gcc -shared -s -static-libgcc -o jsonlite.dll tmp.def base64.o collapse_array.o collapse_object.o collapse_pretty.o escape_chars.o integer64_to_na.o is_datelist.o is_recordlist.o is_scalarlist.o modp_numtoa.o null_to_na.o num_to_char.o parse.o prettify.o push_parser.o r-base64.o register.o row_collapse.o transpose_list.o validate.o -Lyajl -lstatyajl -LC:/PROGRA1/R/R-411.2/bin/x64 -lR
      installing to C:/Users/brill/Documents/R/win-library/4.1/00LOCK-jsonlite/00new/jsonlite/libs/x64
    • R
    • inst
    • byte-compile and prepare package for lazy loading
      in method for ‘asJSON’ with signature ‘“blob”‘: no definition for class “blob”
    • help
  • ** installing help indices
    converting help for package ‘jsonlite’
    finding HTML links … done
    base64 html
    flatten html
    fromJSON html
    prettify html
    rbind_pages html
    read_json html
    serializeJSON html
    stream_in html
    unbox html
    validate html
    • building package indices
    • installing vignettes
    • testing if installed package can be loaded from temporary location
    • testing if installed package can be loaded from final location
    • testing if installed package keeps a record of temporary installation path
  • DONE (jsonlite)




Installing R packages

install.packages(c("stringr", "hash", "tau", "Sejong", "RSQLite", "devtools"),
type = "binary")


The downloaded source packages are in
‘C:\Users\brill\AppData\Local\Temp\RtmpmuDZXg\downloaded_packages’
    ‘C:/Users/brill/Documents/R/win-library/4.1’의 위치에 패키지(들)을 설치합니다.
    (왜냐하면 ‘lib’가 지정되지 않았기 때문입니다)
    ‘fastmap’, ‘highr’, ‘xfun’, ‘diffobj’, ‘rematch2’, ‘bit’, ‘cachem’, ‘processx’, ‘prettyunits’, ‘digest’, ‘xopen’, ‘brew’, ‘commonmark’, ‘knitr’, ‘cpp11’, ‘brio’, ‘evaluate’, ‘praise’, ‘ps’, ‘waldo’, ‘bit64’, ‘blob’, ‘DBI’, ‘memoise’, ‘Rcpp’, ‘plogr’, ‘callr’, ‘pkgbuild’, ‘pkgload’, ‘rcmdcheck’, ‘roxygen2’, ‘rversions’, ‘sessioninfo’, ‘testthat’(들)을 또한 설치합니다.

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/fastmap_1.1.0.zip'
Content type ‘application/zip’ length 215381 bytes (210 KB)
downloaded 210 KB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/highr_0.9.zip'
Content type ‘application/zip’ length 46725 bytes (45 KB)
downloaded 45 KB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/xfun_0.28.zip'
Content type ‘application/zip’ length 386111 bytes (377 KB)
downloaded 377 KB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/diffobj_0.3.5.zip'
Content type ‘application/zip’ length 999001 bytes (975 KB)
downloaded 975 KB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/rematch2_2.1.2.zip'
Content type ‘application/zip’ length 47584 bytes (46 KB)
downloaded 46 KB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/bit_4.0.4.zip'
Content type ‘application/zip’ length 635254 bytes (620 KB)
downloaded 620 KB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/cachem_1.0.6.zip'
Content type ‘application/zip’ length 79002 bytes (77 KB)
downloaded 77 KB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/processx_3.5.2.zip'
Content type ‘application/zip’ length 1246508 bytes (1.2 MB)
downloaded 1.2 MB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/prettyunits_1.1.1.zip'
Content type ‘application/zip’ length 37755 bytes (36 KB)
downloaded 36 KB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/digest_0.6.29.zip'
Content type ‘application/zip’ length 266591 bytes (260 KB)
downloaded 260 KB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/xopen_1.0.0.zip'
Content type ‘application/zip’ length 24785 bytes (24 KB)
downloaded 24 KB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/brew_1.0-6.zip'
Content type ‘application/zip’ length 113926 bytes (111 KB)
downloaded 111 KB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/commonmark_1.7.zip'
Content type ‘application/zip’ length 265490 bytes (259 KB)
downloaded 259 KB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/knitr_1.36.zip'
Content type ‘application/zip’ length 1469306 bytes (1.4 MB)
downloaded 1.4 MB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/cpp11_0.4.2.zip'
Content type ‘application/zip’ length 327396 bytes (319 KB)
downloaded 319 KB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/brio_1.1.3.zip'
Content type ‘application/zip’ length 48880 bytes (47 KB)
downloaded 47 KB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/evaluate_0.14.zip'
Content type ‘application/zip’ length 76790 bytes (74 KB)
downloaded 74 KB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/praise_1.0.0.zip'
Content type ‘application/zip’ length 19849 bytes (19 KB)
downloaded 19 KB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/ps_1.6.0.zip'
Content type ‘application/zip’ length 775912 bytes (757 KB)
downloaded 757 KB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/waldo_0.3.1.zip'
Content type ‘application/zip’ length 96434 bytes (94 KB)
downloaded 94 KB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/bit64_4.0.5.zip'
Content type ‘application/zip’ length 565517 bytes (552 KB)
downloaded 552 KB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/blob_1.2.2.zip'
Content type ‘application/zip’ length 48321 bytes (47 KB)
downloaded 47 KB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/DBI_1.1.1.zip'
Content type ‘application/zip’ length 686681 bytes (670 KB)
downloaded 670 KB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/memoise_2.0.1.zip'
Content type ‘application/zip’ length 50131 bytes (48 KB)
downloaded 48 KB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/Rcpp_1.0.7.zip'
Content type ‘application/zip’ length 3263462 bytes (3.1 MB)
downloaded 3.1 MB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/plogr_0.2.0.zip'
Content type ‘application/zip’ length 18943 bytes (18 KB)
downloaded 18 KB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/callr_3.7.0.zip'
Content type ‘application/zip’ length 437774 bytes (427 KB)
downloaded 427 KB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/pkgbuild_1.3.0.zip'
Content type ‘application/zip’ length 146266 bytes (142 KB)
downloaded 142 KB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/pkgload_1.2.4.zip'
Content type ‘application/zip’ length 156265 bytes (152 KB)
downloaded 152 KB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/rcmdcheck_1.4.0.zip'
Content type ‘application/zip’ length 170257 bytes (166 KB)
downloaded 166 KB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/roxygen2_7.1.2.zip'
Content type ‘application/zip’ length 1352846 bytes (1.3 MB)
downloaded 1.3 MB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/rversions_2.1.1.zip'
Content type ‘application/zip’ length 67399 bytes (65 KB)
downloaded 65 KB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/sessioninfo_1.2.2.zip'
Content type ‘application/zip’ length 186234 bytes (181 KB)
downloaded 181 KB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/testthat_3.1.1.zip'
Content type ‘application/zip’ length 2545637 bytes (2.4 MB)
downloaded 2.4 MB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/stringr_1.4.0.zip'
Content type ‘application/zip’ length 216715 bytes (211 KB)
downloaded 211 KB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/hash_2.2.6.1.zip'
Content type ‘application/zip’ length 178061 bytes (173 KB)
downloaded 173 KB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/tau_0.0-24.zip'
Content type ‘application/zip’ length 186662 bytes (182 KB)
downloaded 182 KB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/Sejong_0.01.zip'
Content type ‘application/zip’ length 1617954 bytes (1.5 MB)
downloaded 1.5 MB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/RSQLite_2.2.9.zip'
Content type ‘application/zip’ length 2511267 bytes (2.4 MB)
downloaded 2.4 MB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/devtools_2.4.3.zip'
Content type ‘application/zip’ length 423398 bytes (413 KB)
downloaded 413 KB

package ‘fastmap’ successfully unpacked and MD5 sums checked
package ‘highr’ successfully unpacked and MD5 sums checked
package ‘xfun’ successfully unpacked and MD5 sums checked
package ‘diffobj’ successfully unpacked and MD5 sums checked
package ‘rematch2’ successfully unpacked and MD5 sums checked
package ‘bit’ successfully unpacked and MD5 sums checked
package ‘cachem’ successfully unpacked and MD5 sums checked
package ‘processx’ successfully unpacked and MD5 sums checked
package ‘prettyunits’ successfully unpacked and MD5 sums checked
package ‘digest’ successfully unpacked and MD5 sums checked
package ‘xopen’ successfully unpacked and MD5 sums checked
package ‘brew’ successfully unpacked and MD5 sums checked
package ‘commonmark’ successfully unpacked and MD5 sums checked
package ‘knitr’ successfully unpacked and MD5 sums checked
package ‘cpp11’ successfully unpacked and MD5 sums checked
package ‘brio’ successfully unpacked and MD5 sums checked
package ‘evaluate’ successfully unpacked and MD5 sums checked
package ‘praise’ successfully unpacked and MD5 sums checked
package ‘ps’ successfully unpacked and MD5 sums checked
package ‘waldo’ successfully unpacked and MD5 sums checked
package ‘bit64’ successfully unpacked and MD5 sums checked
package ‘blob’ successfully unpacked and MD5 sums checked
package ‘DBI’ successfully unpacked and MD5 sums checked
package ‘memoise’ successfully unpacked and MD5 sums checked
package ‘Rcpp’ successfully unpacked and MD5 sums checked
package ‘plogr’ successfully unpacked and MD5 sums checked
package ‘callr’ successfully unpacked and MD5 sums checked
package ‘pkgbuild’ successfully unpacked and MD5 sums checked
package ‘pkgload’ successfully unpacked and MD5 sums checked
package ‘rcmdcheck’ successfully unpacked and MD5 sums checked
package ‘roxygen2’ successfully unpacked and MD5 sums checked
package ‘rversions’ successfully unpacked and MD5 sums checked
package ‘sessioninfo’ successfully unpacked and MD5 sums checked
package ‘testthat’ successfully unpacked and MD5 sums checked
package ‘stringr’ successfully unpacked and MD5 sums checked
package ‘hash’ successfully unpacked and MD5 sums checked
package ‘tau’ successfully unpacked and MD5 sums checked
package ‘Sejong’ successfully unpacked and MD5 sums checked
package ‘RSQLite’ successfully unpacked and MD5 sums checked
package ‘devtools’ successfully unpacked and MD5 sums checked





Installing KoNLP (the noun extractor) with the remotes package (in R)

# install.packages("remotes")
remotes::install_github("haven-jeon/KoNLP",
upgrade = "never",
force = TRUE,
INSTALL_opts = c("--no-multiarch"))


    Downloading GitHub repo haven-jeon/KoNLP@HEAD
    √ checking for file ‘C:\Users\brill\AppData\Local\Temp\RtmpmuDZXg\remotes2cd03d177e06\haven-jeon-KoNLP-960fbbc/DESCRIPTION’ …
  • preparing ‘KoNLP’: (722ms)
    √ checking DESCRIPTION meta-information …
  • checking for LF line-endings in source and make files and shell scripts
  • checking for empty or unneeded directories
  • looking to see if a ‘data/datalist’ file should be added
  • building ‘KoNLP_0.80.2.tar.gz’

‘C:/Users/brill/Documents/R/win-library/4.1’의 위치에 패키지(들)을 설치합니다.
(왜냐하면 ‘lib’가 지정되지 않았기 때문입니다)

  • installing source package ‘KoNLP’ …
    • using staged installation
    • R
    • data
    • inst
    • byte-compile and prepare package for lazy loading
    • help
  • ** installing help indices
    converting help for package ‘KoNLP’
    finding HTML links … done
    HangulAutomata html
    KtoS html
    MorphAnalyzer html
    SimplePos09 html
    SimplePos22 html
    StoK html
    backupUsrDic html
    buildDictionary html
    concordance_file html
    concordance_str html
    convertHangulStringToJamos html
    convertHangulStringToKeyStrokes html
    convertTag html
    editweights html
    extractNoun html
    get_dictionary html
    is.ascii html
    is.hangul html
    is.jaeum html
    is.jamo html
    is.moeum html
    mergeUserDic html
    mutualinformation html
    reloadAllDic html
    reloadUserDic html
    restoreUsrDic html
    scala_library_install html
    statDic html
    tags html
    useNIADic html
    useSejongDic html
    useSystemDic html
    • building package indices
    • installing vignettes

[1] TRUE
[1] 5744974
Successfully installed Scala runtime library in C:/Users/brill/Documents/R/win-library/4.1/00LOCK-KoNLP/00new/KoNLP/java/scala-library-2.11.8.jar
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path

  • DONE (KoNLP)



Installing the KoNLP noun extractor

library(KoNLP)
useNIADic()


Backup was just finished!
Downloading package from url: https://github.com/haven-jeon/NIADic/releases/download/0.0.1/NIADic_0.0.1.tar.gz
Installing 16 packages: colorspace, viridisLite, RColorBrewer, munsell, labeling, farver, base64enc, htmltools, scales, isoband, gtable, jquerylib, tinytex, ggplot2, data.table, rmarkdown
‘C:/Users/brill/Documents/R/win-library/4.1’의 위치에 패키지(들)을 설치합니다.
(왜냐하면 ‘lib’가 지정되지 않았기 때문입니다)
trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/colorspace_2.0-2.zip'
Content type ‘application/zip’ length 2645307 bytes (2.5 MB)
downloaded 2.5 MB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/viridisLite_0.4.0.zip'
Content type ‘application/zip’ length 1299504 bytes (1.2 MB)
downloaded 1.2 MB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/RColorBrewer_1.1-2.zip'
Content type ‘application/zip’ length 55707 bytes (54 KB)
downloaded 54 KB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/munsell_0.5.0.zip'
Content type ‘application/zip’ length 245486 bytes (239 KB)
downloaded 239 KB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/labeling_0.4.2.zip'
Content type ‘application/zip’ length 62679 bytes (61 KB)
downloaded 61 KB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/farver_2.1.0.zip'
Content type ‘application/zip’ length 1752621 bytes (1.7 MB)
downloaded 1.7 MB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/base64enc_0.1-3.zip'
Content type ‘application/zip’ length 43156 bytes (42 KB)
downloaded 42 KB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/htmltools_0.5.2.zip'
Content type ‘application/zip’ length 347310 bytes (339 KB)
downloaded 339 KB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/scales_1.1.1.zip'
Content type ‘application/zip’ length 558895 bytes (545 KB)
downloaded 545 KB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/isoband_0.2.5.zip'
Content type ‘application/zip’ length 2726764 bytes (2.6 MB)
downloaded 2.6 MB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/gtable_0.3.0.zip'
Content type ‘application/zip’ length 434327 bytes (424 KB)
downloaded 424 KB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/jquerylib_0.1.4.zip'
Content type ‘application/zip’ length 525848 bytes (513 KB)
downloaded 513 KB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/tinytex_0.35.zip'
Content type ‘application/zip’ length 126495 bytes (123 KB)
downloaded 123 KB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/ggplot2_3.3.5.zip'
Content type ‘application/zip’ length 4130301 bytes (3.9 MB)
downloaded 3.9 MB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/data.table_1.14.2.zip'
Content type ‘application/zip’ length 2600846 bytes (2.5 MB)
downloaded 2.5 MB

trying URL ‘https://cran.rstudio.com/bin/windows/contrib/4.1/rmarkdown_2.11.zip'
Content type ‘application/zip’ length 3660449 bytes (3.5 MB)
downloaded 3.5 MB

package ‘colorspace’ successfully unpacked and MD5 sums checked
package ‘viridisLite’ successfully unpacked and MD5 sums checked
package ‘RColorBrewer’ successfully unpacked and MD5 sums checked
package ‘munsell’ successfully unpacked and MD5 sums checked
package ‘labeling’ successfully unpacked and MD5 sums checked
package ‘farver’ successfully unpacked and MD5 sums checked
package ‘base64enc’ successfully unpacked and MD5 sums checked
package ‘htmltools’ successfully unpacked and MD5 sums checked
package ‘scales’ successfully unpacked and MD5 sums checked
package ‘isoband’ successfully unpacked and MD5 sums checked
package ‘gtable’ successfully unpacked and MD5 sums checked
package ‘jquerylib’ successfully unpacked and MD5 sums checked
package ‘tinytex’ successfully unpacked and MD5 sums checked
package ‘ggplot2’ successfully unpacked and MD5 sums checked
package ‘data.table’ successfully unpacked and MD5 sums checked
package ‘rmarkdown’ successfully unpacked and MD5 sums checked

The downloaded binary packages are in
C:\Users\brill\AppData\Local\Temp\RtmpmuDZXg\downloaded_packages
√ checking for file ‘C:\Users\brill\AppData\Local\Temp\RtmpmuDZXg\remotes2cd0437ea43\NIADic/DESCRIPTION’ …

  • preparing ‘NIADic’:
    √ checking DESCRIPTION meta-information …
    √ checking vignette meta-information …
  • checking for LF line-endings in source and make files and shell scripts
  • checking for empty or unneeded directories
  • building ‘NIADic_0.0.1.tar.gz’

Installing package into ‘C:/Users/brill/Documents/R/win-library/4.1’
(as ‘lib’ is unspecified)

  • installing source package ‘NIADic’ …
    • using staged installation
    • R
    • inst
    • byte-compile and prepare package for lazy loading
    • help
  • ** installing help indices
    converting help for package ‘NIADic’
    finding HTML links … done
    get_dic html
    • building package indices
    • installing vignettes
    • testing if installed package can be loaded from temporary location
    • testing if installed package can be loaded from final location
    • testing if installed package keeps a record of temporary installation path
  • DONE (NIADic)
    1213109 words dictionary was built.




Verifying the noun extractor after installation


text = "뿌리산업’의 기반이 되는 공정기술의 범위가 관련법 제정 10년 만에 확대 개편된다. 뿌리기업 우대 지원과 청년층 등 신규인력 유입 지원을 강화하기 위한 법적 토대도 마련된다.
산업통상자원부는 이 같은 내용을 담은 ‘뿌리산업 진흥과 첨단화에 관한 법률(뿌리산업법) 시행령’ 개정안이 14일 국무회의에서 의결돼 오는 16일부터 시행된다고 밝혔다.
먼저 뿌리산업법 기반 공정기술(뿌리기술)의 범위가 기존 6개(주조, 금형, 소성가공, 용접, 표면처리, 열처리)에서 14개로 늘어난다.
구체적으로 소재 다원화 공정기술에 사출·프레스, 정밀가공, 적층제조, 산업용 필름 및 지류 등 4개 기술이 포함된다. 산업부는 이를 통해 세라믹, 플라스틱, 탄성소재, 탄소, 펄프 등 다양한 소재 기반 제조 공정을 확산할 계획이다. 또 지능화 공정기술로 로봇, 센서, 산업 지능형 소프트웨어, 엔지니어링 설계 등 4개 기술이 추가된다.
뿌리기술 범위가 확대되면서 뿌리산업의 범위도 기존 6대 산업, 76개 업종에서 14대 산업, 111개 업종으로 늘어난다.

이번 개정을 통해 뿌리기업 확인 절차, 확인서 유효기간(3년), 사후관리 등에 관한 규정도 신설됐다. 뿌리기업은 뿌리기술을 활용해 사업을 영위하는 업종 또는 뿌리기술에 활용되는 장비 제조 분야를 말한다.
뿌리기업 확인 제도는 외국인 근로자 고용 우대 혜택 등이 주어지는 뿌리산업 관련 우대 지원 대상을 명확히 정하기 위한 것으로 국가뿌리산업진흥센터에서 확인서를 발급해오고 있다. 2012년부터 1만1766건이 발급됐으며 현재 5843건이 유효한 것으로 집계됐다.

‘일하기 좋은 뿌리기업’ 선정을 위한 기준과 절차, 지원 내용 등에 관한 규정도 새로 만들어졌다. ‘일하기 좋은 뿌리기업’은 뿌리산업에 청년층 등 신규 인력 유입을 촉진하기 위해 근로·복지 환경, 성장 역량 등이 우수한 기업을 산업부가 선정해 홍보 등을 지원하는 제도다.
산업부는 이번 개정 사항이 원활히 시행될 수 있도록 업종별 협·단체, 뿌리기업, 지자체 등을 대상으로 적극 홍보할 방침이다. 아울러 매년 발간하는 뿌리산업 백서를 통해 새롭게 추가되는 8대 차세대 공정기술에 대한 내용, 기술 동향 등을 상세하게 제공하기로 했다.
산업부 관계자는 “이번 개정은 2011년 뿌리산업법 제정 후 10년 만에 뿌리기술을 소재다원화와 지능화 중심으로 확장한 것으로, 뿌리산업의 기술 융복합화와 첨단화를 촉진하고 신규 인력 유입 지원을 강화하기 위한 법적 토대를 마련하였다는 데에 의의가 있다”고 말했다."

extractNoun(text)



[1] “뿌리산업’” “기반” “공정기술”
[4] “범위” “관련” “법”
[7] “제정” “10” “년”
[10] “만” “확대” “개편”
[13] “뿌리” “기업” “우대”
[16] “지원” “청년층” “등”
[19] “신규인력” “유입” “지원”
[22] “강화” “하기” “토대”
[25] “마련” “산업” “통상”
[28] “자원” “부” “내용”
[31] “담” “‘뿌리산업” “진흥”
[34] “첨단화” “법률(뿌리산업법)” “시행령’”
[37] “개정안” “14” “국무회의”
[40] “의결” “16” “일”
[43] “시행” “뿌리” “산업”
[46] “법” “기반” “공정기술(뿌리기술)”
[49] “범위” “기존” “6개(주조”
[52] “금형” “소성” “가공”
[55] “용접” “표면처리” “열처리”
[58] “14” “개” “구체”
[61] “적” “소재” “다원화”
[64] “공정기술” “사출·프레스” “정밀가공”
[67] “적층” “제조” “산업용”
[70] “필름” “지류” “등”
[73] “4” “개” “기술”
[76] “포함” “산업” “부”
[79] “이” “세라믹” “플라스틱”
[82] “탄성소” “재” “탄소”
[85] “펄프” “등” “다양”
[88] “한” “소재” “기반”
[91] “제조” “공정” “확산”
[94] “할” “계획” “지능화”
[97] “공정기술” “로봇” “센서”
[100] “산업” “지능형” “소프트웨어”
[103] “엔지니어링” “설계” “등”
[106] “4” “개” “기술”
[109] “추가” “뿌리” “기술”
[112] “범위” “확대” “되”
[115] “뿌리” “산업” “범위”
[118] “기존” “6” “대”
[121] “산업” “76” “개”
[124] “업종” “14” “대”
[127] “산업” “111” “개”
[130] “업종” “이번” “개정”
[133] “뿌리” “기업” “확인”
[136] “절차” “확인” “유효”
[139] “기” “3” “년”
[142] “사후관리” “등” “규정도”
[145] “신설” “뿌리” “기업”
[148] “뿌리” “기술” “활용”
[151] “해” “사업” “영위”
[154] “하” “업종” “뿌리”
[157] “기술” “활용” “되”
[160] “장비” “제조” “분야”
[163] “말” “뿌리” “기업”
[166] “확인” “제” “외국”
[169] “근로자” “고용” “우대”
[172] “혜택” “등” “뿌리”
[175] “산업” “관련” “우대”
[178] “지원” “대상” “것”
[181] “국가” “뿌리” “산업진흥”
[184] “센터” “확인서” “발급”
[187] “해오” “2012” “년”
[190] “1” “만” “1766”
[193] “건” “발급” “5843”
[196] “건” “유효” “한”
[199] “것” “집계” “‘일하기”
[202] “뿌리기업’” “선정” “기준”
[205] “절차” “지원” “내용”
[208] “등” “규정도” “‘일하기”
[211] “뿌리기업’은” “뿌리” “산업”
[214] “청년층” “등” “신규”
[217] “인력” “유입” “촉진”
[220] “하기” “근로·복지” “환경”
[223] “성장” “역량” “등”
[226] “우수” “한” “기업”
[229] “산업” “부” “선정”
[232] “해” “홍보” “등”
[235] “지원” “하” “제도”
[238] “산업” “부” “이번”
[241] “개정” “사항” “시행”
[244] “수” “업종” “별”
[247] “협·단체” “뿌리” “기업”
[250] “지자체” “등” “대상”
[253] “적극” “홍보” “할”
[256] “방침” “발간” “하”
[259] “뿌리” “산업” “백서”
[262] “추가” “되” “8”
[265] “대” “차세대” “공정기술”
[268] “내용” “기술” “동향”
[271] “등” “상세” “하게”
[274] “제공” “하기” “산업”
[277] “부” “관계자” ““이번”
[280] “개정” “2011” “년”
[283] “뿌리” “산업” “법”
[286] “제정” “후” “10”
[289] “년” “만” “뿌리”
[292] “기술” “소재” “다원화”
[295] “지능화” “중심” “확장”
[298] “한” “것” “뿌리”
[301] “산업” “기술” “융복합”
[304] “화” “첨단화” “촉진”
[307] “신규” “인력” “유입”
[310] “지원” “강화” “하기”
[313] “토대” “마련” “데”
[316] “의의” “있다”고” “말”
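
A common next step after noun extraction is counting how often each noun occurs. The sketch below is an illustration rather than part of the original notes: `nouns` is a small hand-copied sample of the output above, and `count_nouns()` is a hypothetical helper (not a KoNLP function), so the code runs without KoNLP installed. In a real session the vector would come from `extractNoun(text)`.

```r
# Hypothetical sample: a few nouns copied from the extractNoun() output above.
# In a real session this would be: nouns <- extractNoun(text)
nouns <- c("뿌리", "산업", "기술", "뿌리", "기업", "산업", "법", "뿌리", "등")

# count_nouns() is an illustrative helper, not a KoNLP function.
# It drops tokens shorter than `min_chars` characters and returns
# the frequencies sorted from most to least common.
count_nouns <- function(x, min_chars = 2) {
  kept <- x[nchar(x) >= min_chars]
  sort(table(kept), decreasing = TRUE)
}

noun_freq <- count_nouns(nouns)
print(noun_freq)
```

Filtering out single-character tokens is only one possible cleanup rule; it is assumed here because fragments such as "등" or "법" carry little meaning on their own.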

- Noun extractor installation complete

R for DS_03 ggplot2

Welcome


  • Copyright: “R for Data Science” by Hadley Wickham and Garrett Grolemund (O’Reilly). Copyright 2017 Garrett Grolemund, Hadley Wickham, 978-1-491-91039-9





Introduction


  • How to visualise your data in R using ggplot2.
  • ggplot2 is a package for drawing graphs.
    ggplot2 theoretical background

3.1.1 Prerequisites


install.packages("tidyverse")

library(tidyverse)
#> ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
#> ✔ ggplot2 3.3.2 ✔ purrr 0.3.4
#> ✔ tibble 3.0.3 ✔ dplyr 1.0.2
#> ✔ tidyr 1.1.2 ✔ stringr 1.4.0
#> ✔ readr 1.4.0 ✔ forcats 0.5.0
#> ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
#> ✖ dplyr::filter() masks stats::filter()
#> ✖ dplyr::lag() masks stats::lag()

install.packages(c("nycflights13", "gapminder", "Lahman"))

3.2 First steps


3.2.1 The mpg data frame

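  • mpg contains observations collected by the US Environmental Protection Agency on 38 models of car.
  • A data frame is a rectangular collection of variables (in the columns) and observations (in the rows).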
mpg

mpg

  • displ = car’s engine size, in litres
  • hwy = fuel efficiency in miles per gallon (mpg)
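
Before plotting, it can help to look at the data frame directly. The following is a minimal sketch, assuming ggplot2 is installed (mpg ships with it); the choice of columns to preview is only for illustration.

```r
library(ggplot2)  # provides the mpg data frame

# Dimensions: one row per observation, one column per variable
dim(mpg)

# Preview the two variables described above, next to the car model
head(mpg[, c("model", "displ", "hwy")])

# Range and quartiles of engine size and highway fuel efficiency
summary(mpg$displ)
summary(mpg$hwy)
```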

3.2.2 Creating a ggplot

ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy))
  • ggplot(data = mpg): creates an empty graph.
  • geom_point(): adds a layer of points, which produces a scatterplot.

scatterplot_mpg

  • mapping = aes(x = displ, y = hwy): maps displ to the x-axis and hwy to the y-axis.

3.2.3 A graphing template

ggplot(data = <DATA>) + 
  <GEOM_FUNCTION>(mapping = aes(<MAPPINGS>))

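This is the general template to follow: replace the parts in angle brackets with a dataset, a geom function, and a set of mappings.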
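
To make the template concrete, the sketch below fills in its three slots twice. The first plot is the scatterplot from 3.2.2; the second swaps in `geom_boxplot()` and the `class` column purely as an illustration, and the object names `p_point` and `p_box` are hypothetical.

```r
library(ggplot2)

# <DATA> = mpg, <GEOM_FUNCTION> = geom_point, <MAPPINGS> = x = displ, y = hwy
p_point <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy))

# Same data, different geom and mappings: highway mileage per car class
p_box <- ggplot(data = mpg) +
  geom_boxplot(mapping = aes(x = class, y = hwy))

# Printing a ggplot object draws it
print(p_point)
print(p_box)
```

Only the parts in angle brackets change between the two plots; the surrounding structure stays the same.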

3.3 Aesthetic mappings


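aesthetic: a visual property of the plotted objects, such as shape, color, or size

  • value : data
  • level : aesthetic properties
  • size : size of the points
  • color (or colour) : color of the points
  • alpha : transparency of the points
  • shape : shape of the points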
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy, color = class))

mpg_color

ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy, size = class))

mpg_Size

Both spellings, colour and color, are accepted.

ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y= hwy, alpha = class))

mpg_alpha

ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy, shape = class))

mpg_shape


(Setting the color manually)

Setting color outside aes() changes the color of the points without generating a legend.

ggplot(data = mpg) + 
geom_point(mapping = aes(x = displ, y = hwy), color = "blue")

mpg3.1_Bulet25



ggplot(data = mpg) + 
geom_point(mapping = aes(x = displ, y = hwy, color = "blue"))

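Here color = "blue" sits inside aes(), so ggplot2 does not read it as a color name. It treats "blue" as a constant variable with a single value, maps that value through the default color scale, and adds a legend. The points therefore come out in the scale's default color rather than in blue.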

mpg_Color_Blue
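
The difference between the two calls can be checked without looking at the plots. This is a small sketch, assuming ggplot2 is installed; `p_fixed` and `p_mapped` are hypothetical names, and `layer_data()` is the ggplot2 helper that returns the values computed for drawing a layer.

```r
library(ggplot2)

# Color set outside aes(): a fixed property of the layer
p_fixed <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy), color = "blue")

# Color given inside aes(): "blue" is treated as data (a constant variable)
p_mapped <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = "blue"))

# Colors that ggplot2 will actually use when drawing each layer
unique(layer_data(p_fixed)$colour)   # the literal color that was requested
unique(layer_data(p_mapped)$colour)  # a color picked by the default scale
```

In the second plot the string "blue" only ends up as a legend label, which is why the points are not drawn in blue.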


Not finished yet!

rmarkdown_Book

R publishing

Rggplot
RPubs

R for DS_01 welcome & Introduction

Welcome


R_Welcome

  • Copyright: “R for Data Science” by Hadley Wickham and Garrett Grolemund (O’Reilly). Copyright 2017 Garrett Grolemund, Hadley Wickham, 978-1-491-91039-9





Introduction


Data science is an exciting discipline that allows you to turn raw data into understanding, insight, and knowledge. The goal of “R for Data Science” is to help you learn the most important tools in R that will allow you to do data science. After reading this book, you’ll have the tools to tackle a wide variety of data science challenges, using the best parts of R.

Let's do data science with R.

1.1 What you will learn


R4ds_Learn

1.2 How this book is organised

1.3 What you won’t learn

1.4 Prerequisites

1.4.1 R

  • Programs to install: see
    Rprograms_install

1.4.3 The tidyverse

install.packages("tidyverse")
library(tidyverse)
#> ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
#> ✔ ggplot2 3.3.2 ✔ purrr 0.3.4
#> ✔ tibble 3.0.3 ✔ dplyr 1.0.2
#> ✔ tidyr 1.1.2 ✔ stringr 1.4.0
#> ✔ readr 1.4.0 ✔ forcats 0.5.0
#> ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
#> ✖ dplyr::filter() masks stats::filter()
#> ✖ dplyr::lag() masks stats::lag()
  • tidyverse : ggplot2, tibble, tidyr, readr, purrr, and dplyr packages

    1.4.4 Other packages

    install.packages(c("nycflights13", "gapminder", "Lahman"))

1.5 Running R code

1.6 Getting help and learning more

1.7 Acknowledgements

1.8 Colophon

My_rpubs


ref.

introduction

Part 3 more