Python - BeautifulSoup4 get_text() removing line breaks?
I'm having a bit of an issue with get_text() compared to what I was able to do in .NET. I think it's removing line breaks, which makes it impossible for me to parse the data.
I'm reading a class from a site, and the text comes in this kind of format:
4.1 £22 4 £27 3.9 £29 3.8 £106 3.75 £24 3.7 £24
It will always follow the same format: decimal, price, decimal, price, decimal, price, and so on.
I've done this in .NET, where element.InnerText returned the string with a line break between each pair.
I was able to do something like:
Dim spltExample As String() = element.InnerText.Split(New String() {Environment.NewLine}, StringSplitOptions.None)
That put one decimal and one price in each result. What I was seeing was:
4.1 £22\n 4 £27\n 3.9 £29\n 3.8 £106\n 3.75 £24\n 3.7 £24
My problem is that bs4 seems to be getting the text in a different format, and I'm hoping that can be changed:
4.1 £224 £273.9 £29 3.8 £1063.75 £243.7 £24
It's squashing each decimal against the preceding price by removing the new line.
The data can be pretty awkward. The numbers won't be static, and I need to know how many combinations of decimal and price there are. .NET would give me a list like:
4.1 £22
4 £27
3.9 £29
3.8 £106
3.75 £24
3.7 £24
Current code:
for result in soup.find_all("span", {"class": "classname"}):
    list_of_results.append(result.get_text())
Example output:
4.1 £224 £273.9 £29 3.8 £1063.75 £243.7 £24
This has left me pretty dead in the water. Is there anything else I can use to get the data and leave the line breaks intact so I can work with them?
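One thing that may help: get_text() accepts a separator argument that is placed between the individual text nodes, so the pieces are not glued together. The sketch below is only an illustration. The HTML is a made-up stand-in (the real markup is not shown in this post), and it assumes the pairs are separated by <br> tags inside the span, which would explain the squashed output. If the page is structured differently, the result will differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: an assumption, since the real page's HTML is not shown.
html = (
    '<span class="classname">'
    "4.1 £22<br>4 £27<br>3.9 £29<br>3.8 £106<br>3.75 £24<br>3.7 £24"
    "</span>"
)

soup = BeautifulSoup(html, "html.parser")

list_of_results = []
for result in soup.find_all("span", {"class": "classname"}):
    # separator is put between text nodes; strip=True trims each node first
    text = result.get_text(separator="\n", strip=True)
    # One "decimal price" entry per line, similar to the .NET Split call
    list_of_results.append(text.split("\n"))

# Split each line into [decimal, price] and count the combinations
pairs = [line.split() for line in list_of_results[0]]
print(len(pairs))
for decimal, price in pairs:
    print(decimal, price)
```

If only the list of pieces is needed, list(result.stripped_strings) should give the same entries without joining and splitting again.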