python - BeautifulSoup4 get_text() Removing Line Breaks?


I'm having a bit of an issue with get_text() compared to what I was able to do in .NET. I think it's removing line breaks, which makes it impossible for me to parse the data.

I'm reading a class from a site, and the data is in this kind of format:

4.1 £22 4 £27 3.9 £29 3.8 £106 3.75 £24 3.7 £24  

It'll always follow the same format: decimal price, decimal price, decimal price, etc.

I've done this in .NET, where element.InnerText returned a string with a line break in between each pair.

I was able to do something like:

Dim spltexample As String() = element.InnerText.Split(New String() {Environment.NewLine}, StringSplitOptions.None)

It would put a decimal and a price in each result. I was seeing:

4.1 £22\n 4 £27\n 3.9 £29\n 3.8 £106\n 3.75 £24\n 3.7 £24  

My problem is that bs4 seems to be getting the text in a different format, and I'm hoping I can change that.

4.1 £224 £273.9 £29 3.8 £1063.75 £243.7 £24 

It's squashing each decimal into the preceding price by removing the new line.

The data can be pretty awkward. The numbers won't be static, and I need to know how many combinations of decimal and price there are. .NET would give me a list like:

4.1 £22
4 £27
3.9 £29
3.8 £106
3.75 £24
3.7 £24
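To illustrate what I'm after once the line breaks are there, here is a rough sketch using only the standard library. The sample string, the `PAIR` pattern and the `parse_pairs` helper are names made up for this example, and it assumes every line holds exactly one decimal followed by one £ price.

```python
import re

# Hypothetical sample, mirroring the newline-separated text .NET returned.
text = "4.1 £22\n 4 £27\n 3.9 £29\n 3.8 £106\n 3.75 £24\n 3.7 £24  "

# One decimal (or whole number), whitespace, then a £ price on each line.
PAIR = re.compile(r"^\s*(\d+(?:\.\d+)?)\s+£(\d+(?:\.\d+)?)\s*$")


def parse_pairs(raw):
    """Return a list of (decimal, price) tuples, one per matching line."""
    pairs = []
    for line in raw.splitlines():
        match = PAIR.match(line)
        if match:  # silently skip blank or malformed lines
            pairs.append((float(match.group(1)), float(match.group(2))))
    return pairs


pairs = parse_pairs(text)
print(len(pairs), pairs)
```

This only works when the separators survive. On the squashed text, something like "£224" can't be told apart from "£22" followed by "4".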

Current code:

for result in soup.find_all("span", {"class": "classname"}):
    list_of_results.append(result.get_text())

Example output:

  4.1 £224 £273.9 £29 3.8 £1063.75 £243.7 £24  

This has left me pretty dead in the water. Is there anything else I can use to get the data and leave the line breaks intact so I can work with them?
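One thing that may be worth trying is the `separator` argument of get_text(). The sketch below is a guess at the situation rather than a confirmed diagnosis: the real markup isn't shown here, so the `<br>` tags between the pairs and the sample `html` string are assumptions made for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the <br> separators are an assumption, since the
# real page source is not shown in the question.
html = '<span class="classname">4.1 £22<br>4 £27<br>3.9 £29<br>3.8 £106</span>'

soup = BeautifulSoup(html, "html.parser")
span = soup.find("span", {"class": "classname"})

# Default call: the text nodes are joined with an empty string, so each
# decimal runs into the preceding price.
squashed = span.get_text()

# With a separator, each text node ends up on its own line.
lines = span.get_text(separator="\n", strip=True).split("\n")

# stripped_strings yields the same pieces without building one big string.
pieces = list(span.stripped_strings)

print(repr(squashed))
print(lines)
```

If the pairs are separated by something other than tags (for example, literal newlines inside a single text node), get_text() should already keep them, so it would be worth checking the raw HTML first.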

