python - UnicodeDecodeError: 'utf8' codec can't decode byte - Euro Symbol


I built a connection to the Google Finance API that gives me stock quotes. It was working fine until I switched to quotes from Europe. These contain the € symbol, and I get the following error:

Traceback (most recent call last):
  File "c:\users\administrator\desktop\getquotes.py", line 32, in <module>
    quote = c.get("sap","fra")
  File "c:\users\administrator\desktop\getquotes.py", line 21, in
    obj = json.loads(content[3:])
  File "c:\python27\lib\json\__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "c:\python27\lib\json\decoder.py", line 365, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "c:\python27\lib\json\decoder.py", line 381, in raw_decode
    obj, end = self.scan_once(s, idx)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 0: invalid start byte

The following is the code I am using. I guess the error appears while json is trying to process the string and cannot resolve the euro symbol:

import urllib2
import json
import time

class googlefinanceapi:
    def __init__(self):
        self.prefix = "http://finance.google.com/finance/info?client=ig&q="

    def get(self,symbol,exchange):
        # the query has the form EXCHANGE:SYMBOL
        url = self.prefix+"%s:%s"%(exchange,symbol)
        u = urllib2.urlopen(url)
        content = u.read()

        # skip the first three characters ("// ") that precede the JSON array
        obj = json.loads(content[3:])
        return obj[0]


if __name__ == "__main__":
    c = googlefinanceapi()

    # poll for a fresh quote every 30 seconds
    while 1:
        quote = c.get("msft","nasdaq")
        print quote
        time.sleep(30)

This is the output Google Finance gives me for the SAP stock, which contains the euro symbol:

// [
{
"id": "8424920"
,"t" : "sap"
,"e" : "fra"
,"l" : "56.51"
,"l_cur" : "€56.51"
,"s": "0"
,"ltt":"8:00pm gmt+2"
,"lt" : "aug 7, 8:00pm gmt+2"
,"c" : "-0.47"
,"cp" : "-0.82"
,"ccol" : "chr"
}
]

I tried to use the following function instead of the (content[3:]) part, but I got the same error, except that it reported an ascii codec error instead of a utf-8 one.

json.loads(unicode(opener.open(...), "iso-8859-15")) 

If anyone has an idea, I would be happy to hear it.

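The document you're fetching appears to be encoded in Windows code page 1252, where the euro sign is encoded as the single byte \x80. That byte is an invalid start byte in UTF-8 and a non-printing control character in the ISO-8859 variants, which is why neither utf-8 nor iso-8859-15 decodes it as intended. Try: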

obj = json.loads(content[3:], 'cp1252') 
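
To make the byte-level point concrete, here is a minimal sketch that compares how the euro sign is encoded under the three codecs mentioned in this thread, and then parses a small sample response. The payload `raw` and the helper name `parse_quotes` are hypothetical, modeled on the output shown in the question, not taken from the real service. The sketch decodes the bytes explicitly before calling json.loads, which is an alternative to passing the encoding as the second argument and should behave the same on Python 2 and Python 3.

```python
import json

EURO = u"\u20ac"  # the euro sign as a Unicode code point

# The same character is stored as different bytes depending on the codec.
euro_cp1252 = EURO.encode("cp1252")       # single byte 0x80
euro_latin9 = EURO.encode("iso-8859-15")  # single byte 0xA4
euro_utf8 = EURO.encode("utf-8")          # three bytes 0xE2 0x82 0xAC

# Hypothetical sample payload (an assumption, shaped like the response in the
# question); the euro sign is the raw byte 0x80, as in cp1252.
raw = b'// [ { "t" : "sap" ,"l_cur" : "\x8056.51" } ]'


def parse_quotes(content, encoding="cp1252"):
    """Drop the leading '// ', decode the bytes, then parse the JSON text."""
    text = content[3:].decode(encoding)
    return json.loads(text)


quote = parse_quotes(raw)[0]

# Decoding the same bytes as UTF-8 fails, because 0x80 cannot start a character.
try:
    raw[3:].decode("utf-8")
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True
```

After this runs, `quote["l_cur"]` is a Unicode string that starts with the euro sign. Decoding up front also keeps the choice of encoding in one visible place if the service ever turns out to use a different one.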
