Code:
http://www.goldalert.com/
This is the code. I commented every line because I'm a bit paranoid about comments and I like everything well commented (any feedback on improving the code is appreciated):
Code:
import urllib2
import re
import time

while 1:
    #reading web page source, and making a list of lines (source lines)
    res = urllib2.urlopen("http://www.goldalert.com/")
    lines = res.readlines()
    #getting the line that has the gold data
    for i in lines:
        if i.find('<div id="gp">') > -1:
            l_g_p = re.findall(r'[\d.]{1,50}', i)
            gold_price = l_g_p[0]
    #getting the line that has the change data
    for i in lines:
        if i.find('<div id="chg"') > -1:
            l_e_c_n = re.findall(r'[+-][\d.]{1,10}', i)
            change_number = l_e_c_n[0]
            l_e_c_p = re.findall(r'[+-][\d.]{1,10}%', i)
            change_percent = l_e_c_p[0]
    #getting the line that has the time data
    for i in lines:
        if i.find('<div id="tm">') > -1:
            time_p = re.sub('\t<div id="tm"> ', '', i)
            #named time_text so it does not shadow the time module used by time.sleep below
            time_text = re.sub('</div>', '', time_p)
    #managing data
    final = gold_price+' '+change_number+' '+change_percent+' '+time_text
    msn_1 = gold_price+' '+'[c=4]'+change_number+' '+change_percent+'[/c]'+' '+time_text #negative = red
    msn_2 = gold_price+' '+'[c=3]'+change_number+' '+change_percent+'[/c]'+' '+time_text #positive = green
    msn_0 = gold_price+' '+'[c=1]'+change_number+' '+change_percent+'[/c]'+' '+time_text #zero = unchanged
    #converting the change from text to a number so it can be compared with 0
    change_value = float(change_number)
    #opening output file
    fileHandle2 = open('output.txt', 'a')
    #writing data to output file
    if change_value < 0:
        fileHandle2.write(msn_1)
    elif change_value == 0:
        fileHandle2.write(msn_0)
    else:
        fileHandle2.write(msn_2)
    #closing output file
    fileHandle2.close()
    #waiting one minute before fetching the page again
    time.sleep(60)
I used the urllib2 library to fetch the HTML code. Is there any other way to get the HTML source? I'm going to add more to the script, so for now I would appreciate any feedback.
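For reference, here is a minimal sketch of how the parsing and formatting steps might be split into functions, so they can be tried on a few sample lines without fetching the page. The sample lines, the names parse_lines and format_msn, and the exact markup are assumptions made for illustration; the real page may look different. It only uses the re module, so it should run the same way on Python 2 and Python 3.

```python
import re

# Hypothetical sample of the three lines the script looks for;
# the real page markup may differ.
SAMPLE_LINES = [
    '\t<div id="gp">1234.50</div>\n',
    '\t<div id="chg" class="dn">-3.20 -0.26%</div>\n',
    '\t<div id="tm"> Jan 01 10:00 NY Time</div>\n',
]


def parse_lines(lines):
    """Pull price, change, percent and time text out of the page lines."""
    data = {}
    for line in lines:
        if '<div id="gp">' in line:
            data['price'] = re.findall(r'[\d.]+', line)[0]
        elif '<div id="chg"' in line:
            data['change'] = re.findall(r'[+-][\d.]+', line)[0]
            data['percent'] = re.findall(r'[+-][\d.]+%', line)[0]
        elif '<div id="tm">' in line:
            # drop every tag on the line, then trim the whitespace
            data['time'] = re.sub(r'<[^>]+>', '', line).strip()
    return data


def format_msn(data):
    """Build the colored line; color codes are the ones used in the script."""
    change = float(data['change'])  # compare as a number, not as text
    if change < 0:
        color = 4  # negative = red
    elif change > 0:
        color = 3  # positive = green
    else:
        color = 1  # unchanged
    return '%s [c=%d]%s %s[/c] %s' % (
        data['price'], color, data['change'], data['percent'], data['time'])
```

Calling format_msn(parse_lines(SAMPLE_LINES)) picks the [c=4] variant for the sample, because the change is negative. In the real script the list of lines would come from res.readlines() instead of SAMPLE_LINES.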