
some errors in a simple Python parser-script - that makes use of BS4


Question

 

 

Hello all,

To get started with Python, I tried to fetch some data from Yahoo with the following script.

Note: the while loop is there so that the data is fetched continuously. I think that part works, but I get an error.

 

 

import bs4
import requests
from bs4 import BeautifulSoup


def parsePrice():
    r=requests.get('http://finance.yahoo.com/quote/FB?p=FB')
    soup=bs4.BeautifulSoup(r.text,"xml")
    price=soup.find_all('div',{'class':'My(6px) Post(r) smartphone_Mt(6px)'})[0].find('span').text
    return price


while True:
    print('the current price_ '+str (parsePrice()))




 

 

 

But something is missing. I keep getting this error when I run the script from the Atom editor:

 

 

Traceback (most recent call last):
  File "C:\Users\Kasper\Documents\_f_s_j\_mk_\_dev_\bs\yahoo_finance.py", line 14, in <module>
    print('the current price_ '+str (parsePrice()))
  File "C:\Users\Kasper\Documents\_f_s_j\_mk_\_dev_\bs\yahoo_finance.py", line 10, in parsePrice
    price=soup.find_all('div',{'class':'My(6px) Post(r) smartphone_Mt(6px)'})[0].find('span').text
IndexError: list index out of range

 

 

Any idea what is going wrong here?

1 answer to this question


The IndexError means find_all() returned an empty list: no div with that exact class string exists in what bs4 parsed, so [0] has nothing to index. Part of the cause is the parser. The page is HTML, so bs4 should be given 'html.parser' rather than "xml". But that alone is unlikely to be enough. The URL serves a page built for display, not for parsing: it relies on a lot of JavaScript to render the content, and the div classes are convoluted and may differ depending on your request headers (which you never set). Given that markup, I find it easier to pull the data out with plain string methods instead of bs4: cut the embedded JSON out of the page source, load it into a dictionary with json.loads, and return whatever you need. Here's a quick example that prints the price quote every minute with a timestamp:

 

import requests
import json
from datetime import datetime
import time


def priceParse(stock):
    url = 'https://finance.yahoo.com/quote/' + stock
    html = requests.get(url).text
    # The page embeds its data as JSON assigned to root.App.main; cut that JSON out of the source.
    json_str = html.split('root.App.main =')[1].split('(this)')[0].split(';\n}')[0].strip()
    # Parse the JSON and drill down to the store that holds the quote data.
    data = json.loads(json_str)['context']['dispatcher']['stores']['QuoteSummaryStore']
    latest_price = data['price']['regularMarketPrice']['fmt']  # formatted price string
    market_state = data['price']['marketState']
    timestamp = datetime.now().strftime("%m/%d/%Y, %H:%M:%S")
    printed_str = stock + ' — $' + latest_price + ' (market state: ' + market_state + ') — ' + timestamp
    print(printed_str)


while True:
    priceParse('FB')
    time.sleep(60)
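
One caveat about the example above: the chained split(...)[1] raises the same kind of IndexError as in the question if the 'root.App.main =' marker is not in the response, and nothing guarantees the page keeps embedding its data this way. Below is a rough sketch of a more defensive extraction step. It assumes the page layout described above; the names extract_quote_store and SAMPLE_HTML are made up for illustration, and the sample is a hand-written fragment, not real Yahoo output.

```python
import json

MARKER = 'root.App.main ='  # marker string taken from the answer above


def extract_quote_store(html):
    """Return the QuoteSummaryStore dict embedded in html, or None if absent."""
    # str.partition never raises; 'found' is '' when the marker is missing.
    _, found, tail = html.partition(MARKER)
    if not found:
        return None
    try:
        # raw_decode parses one JSON value and ignores whatever follows it,
        # so there is no need to guess where the JavaScript statement ends.
        data, _ = json.JSONDecoder().raw_decode(tail.lstrip())
    except json.JSONDecodeError:
        return None
    try:
        return data['context']['dispatcher']['stores']['QuoteSummaryStore']
    except (KeyError, TypeError):
        # The JSON parsed, but it does not have the assumed structure.
        return None


# Hypothetical, hand-written page fragment used only to exercise the function.
SAMPLE_HTML = (
    '<script>(function (root) {\n'
    'root.App.main = {"context": {"dispatcher": {"stores": {"QuoteSummaryStore": '
    '{"price": {"regularMarketPrice": {"raw": 123.45, "fmt": "123.45"}, '
    '"marketState": "REGULAR"}}}}}};\n'
    '}(this));</script>'
)

store = extract_quote_store(SAMPLE_HTML)
price = store['price']['regularMarketPrice']['fmt'] if store else None
missing = extract_quote_store('<html><body>no embedded data</body></html>')
```

With this, the loop can check for None and print a message or retry later instead of crashing when the page changes or the request is served a different layout.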

 
