BeautifulSoup: scraping paragraphs from HTML

30 Aug 2018

from bs4 import BeautifulSoup

# Simple HTML
SIMPLE_HTML = '''<html>
<head></head>
<body>
<h1>This is a title</h1>
<p class="subtitle">Lorem ipsum dolor sit amet.</p>
<p>Here's another p without a class</p>
<ul>
    <li>Sarah</li>
    <li>Mary</li>
    <li>Charlotte</li>
    <li>Carl</li>
</ul>
</body>
</html>'''

# The built-in html.parser is sufficient for this simple HTML
simple_soup = BeautifulSoup(SIMPLE_HTML, 'html.parser')


def find_paragraph():
    # Find the first <p> whose class is "subtitle" and print its text
    print(simple_soup.find('p', {'class': 'subtitle'}).string)


def find_other_paragraph():
    # Collect every <p> tag in the document
    paragraphs = simple_soup.find_all('p')
    # Keep only the paragraphs without the "subtitle" class.
    # attrs.get('class', []) returns an empty list instead of None when a tag
    # has no class attribute, so the "not in" test cannot fail on None.
    other_paragraphs = [p for p in paragraphs if 'subtitle' not in p.attrs.get('class', [])]
    print(other_paragraphs[0].string)


find_paragraph()
find_other_paragraph()
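
If the goal is the text of every paragraph rather than one specific paragraph, the same `find_all('p')` call can be combined with `get_text()`. The following is a minimal sketch, not part of the original snippet: `SAMPLE_HTML` and `paragraph_texts` are hypothetical names chosen for illustration, and the markup is a trimmed copy of the HTML used above.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, trimmed from the HTML in the snippet above
SAMPLE_HTML = '''<html><body>
<h1>This is a title</h1>
<p class="subtitle">Lorem ipsum dolor sit amet.</p>
<p>Here's another p without a class</p>
</body></html>'''


def paragraph_texts(html):
    """Return the stripped text of every <p> tag in the given HTML string."""
    soup = BeautifulSoup(html, 'html.parser')
    # get_text() also works when a paragraph contains nested tags,
    # a case in which .string would return None
    return [p.get_text(strip=True) for p in soup.find_all('p')]


print(paragraph_texts(SAMPLE_HTML))
```

Returning a list instead of printing inside the function makes the result easier to reuse, for example to write the paragraphs to a file or to filter them further.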
