Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


substring - split each character in R

I have a song.txt file:

*****
[1]"The snow glows white on the mountain tonight
Not a footprint to be seen."
[2]"A kingdom of isolation,
and it looks like I'm the Queen"
[3]"The wind is howling like this swirling storm inside
Couldn't keep it in;
Heaven knows I've tried"
*****
[4]"Don't let them in,
don't let them see"
[5]"Be the good girl you always have to be
Conceal, don't feel,
don't let them know"
[6]"Well now they know"
*****

I would like to loop over the lyrics and fill in a list so that each element of the list is a character vector, where each element of that vector is a word in the song.

For example:

[1] "The" "snow" "glows" "white" "on" "the" "mountain" "tonight" "Not" "a" "footprint"
    "to" "be" "seen." "A" "kingdom" "of" "isolation," "and" "it" "looks" "like" "I'm" "the"     
    "Queen" "The" "wind" "is" "howling" "like" "this" "swirling" "storm" "inside"
    "Couldn't" "keep" "it" "in" "Heaven" "knows" "I've" "tried"
[2] "Don't" "let" "them" "in," "don't" "let" "them" "see" "Be" "the" "good" "girl" "you"
   "always" "have" "to" "be" "Conceal," "don't" "feel," "don't" "let" "them" "know"
   "Well" "now" "they" "know"

First I made an empty list with words <- vector("list", 2).

I think I should first read the text into one long character vector and then find where the ***** delimiters start and stop, with:

star <- "\\*{5}"   # the backslash must be doubled inside an R string
pindex <- grep(star, page)

After this what should I do?


1 Reply


It sounds like what you want is strsplit, run (effectively) twice. So, starting from "a single long character string separated by ***** and spaces" (which I assume is what you have?):

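# strsplit() returns a list with one element per input string, so take [[1]]
# to get the character vector of verses before looping over it
list_of_vectors <- lapply(strsplit(song, split = "\\*{5}")[[1]], function(x) {

  # Split each verse by spaces
  split_verse <- strsplit(x, split = " ")

  # Then return it as a vector
  return(unlist(split_verse))

})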

The result should be a list with one element per verse, each element being a character vector of the words in that verse. If the object you read in is not a single character string, show us the file and how you're reading it in ;).
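If the file was read line by line instead, the delimiter positions from grep can be used to slice out each section before splitting it into words. The following is only a rough sketch under a few assumptions: `page` is what `readLines("song.txt")` would return (a short inline sample stands in for it here), the lines hold plain lyrics without the `[1]"` markers, and `split_sections` is a made-up helper name, not an existing function.

```r
# Assumption: `page` holds the file's lines, as readLines("song.txt") would return.
# A short inline sample is used so the sketch runs on its own.
page <- c(
  "*****",
  "The snow glows white on the mountain tonight",
  "Not a footprint to be seen.",
  "*****",
  "Don't let them in,",
  "don't let them see",
  "*****"
)

# Positions of the lines that consist of exactly five asterisks
pindex <- grep("^\\*{5}$", page)

# Hypothetical helper: one character vector of words per delimited section
split_sections <- function(lines, delims) {
  lapply(seq_len(length(delims) - 1), function(i) {
    # Lines strictly between delimiter i and delimiter i + 1
    n_lines <- delims[i + 1] - delims[i] - 1
    block <- lines[seq(delims[i] + 1, length.out = n_lines)]
    # Split on runs of whitespace and flatten into a single vector
    words <- unlist(strsplit(block, "\\s+"))
    # Drop empty strings left behind by leading whitespace
    words[nzchar(words)]
  })
}

words <- split_sections(page, pindex)
```

With the sample above, `words` has two elements: `words[[1]]` starts with "The" "snow" "glows" and `words[[2]]` starts with "Don't" "let" "them". Punctuation stays attached to the words, as in the desired output in the question.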

