python - BeautifulSoup URL loading error


So I'm trying to get the content of this page using Beautiful Soup. I want to create a dictionary of CSS color names, and this seemed like a quick and easy way to access them. Naturally, I did a quick, basic version:

    from bs4 import BeautifulSoup as bs

    url = 'http://www.w3schools.com/cssref/css_colornames.asp'
    soup = bs(url)

For some reason I'm only getting the URL in a p tag inside the body, and that's it:

    >>> print soup.prettify()
    <html>
     <body>
      <p>
       http://www.w3schools.com/cssref/css_colornames.asp
      </p>
     </body>
    </html>

Why won't BeautifulSoup give me access to the information I need?

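BeautifulSoup does not load a URL for you. It parses whatever string it is given as markup, which is why the URL itself ended up as text inside a p tag.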

You need to pass in the full HTML page, which means you need to load the URL first. Here is a sample using the urllib2.urlopen function to achieve that:

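    from urllib2 import urlopen  # urllib2 is Python 2 only
    from bs4 import BeautifulSoup as bs

    # Download the page first, then hand the HTML to BeautifulSoup
    url = 'http://www.w3schools.com/cssref/css_colornames.asp'
    source = urlopen(url).read()
    soup = bs(source)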
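The sample above targets Python 2. In Python 3, urlopen lives in urllib.request instead of urllib2. Below is a minimal sketch of the same two steps under that assumption; the helper names fetch_html and make_soup are hypothetical, and the explicit "html.parser" argument is a choice made here rather than something the original answer specifies.

```python
from urllib.request import urlopen

from bs4 import BeautifulSoup


def fetch_html(url, timeout=10):
    """Download the page at `url` and return its raw bytes (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def make_soup(markup):
    """Parse already-downloaded HTML; BeautifulSoup itself never fetches anything."""
    return BeautifulSoup(markup, "html.parser")


if __name__ == "__main__":
    # Requires network access; the URL is the one from the question.
    page = fetch_html("http://www.w3schools.com/cssref/css_colornames.asp")
    soup = make_soup(page)
    print(soup.title)
```

Keeping the download and the parse in separate functions makes it easy to test the parsing step on a saved or inline HTML string without hitting the network.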

Now you can extract the colours just fine:

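    # The colour names sit in the table with class "reference"
    css_table = soup.find('table', class_='reference')
    for row in css_table.find_all('tr'):
        cells = row.find_all('td')
        if cells:  # header rows have no td cells, so skip them
            print cells[0].a.text, cells[1].a.text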
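Since the goal in the question was a dictionary of CSS color names, the loop above can be adapted to collect the pairs instead of printing them. The following is a rough Python 3 sketch, not the answer's own code: css_colors is a hypothetical helper, and SAMPLE_HTML is a made-up fragment that only imitates the table layout the answer relies on (the live page may be structured differently).

```python
from bs4 import BeautifulSoup

# Hypothetical sample imitating the layout assumed by the answer:
# a table with class "reference" whose first two cells hold name and hex value.
SAMPLE_HTML = """
<table class="reference">
  <tr><th>Color Name</th><th>HEX</th></tr>
  <tr><td><a href="#">AliceBlue</a></td><td><a href="#">#F0F8FF</a></td></tr>
  <tr><td><a href="#">AntiqueWhite</a></td><td><a href="#">#FAEBD7</a></td></tr>
</table>
"""


def css_colors(markup):
    """Return {color name: hex value} from the 'reference' table, if present."""
    soup = BeautifulSoup(markup, "html.parser")
    table = soup.find("table", class_="reference")
    colors = {}
    if table is None:
        return colors
    for row in table.find_all("tr"):
        cells = row.find_all("td")
        if len(cells) >= 2:  # skip header rows, which contain th cells only
            name = cells[0].get_text(strip=True)
            value = cells[1].get_text(strip=True)
            colors[name] = value
    return colors


print(css_colors(SAMPLE_HTML))
```

Using get_text on the cell rather than cells[0].a.text avoids an AttributeError when a cell has no link inside it.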
