Web Scraping with Python

 

In this workshop, you will learn how to extract data from websites using Python, a process known as web scraping. You will use several Python packages to retrieve the HTML of webpages and to extract specific content from both static and dynamic webpages.

Difficulty rating: ★★★☆ Intermediate

Who is it for?

Research staff and research students who are already familiar with the basics of Python.

Summary of the topics covered

  • Recap of how websites are structured using HTML
  • How to extract information from HTML data using the BeautifulSoup package
  • How to retrieve the HTML of a webpage using the requests package
  • The difference between static and dynamic webpages
  • How to scrape dynamic content using Selenium
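
To give a flavour of the first two topics, here is a minimal sketch of extracting links from HTML with BeautifulSoup. The HTML snippet, the "book" class name and the extract_links function are made up for illustration and are not taken from the workshop materials. For a real static page, the HTML string would typically come from requests.get(url).text instead.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a downloaded static page.
HTML = """
<html>
  <body>
    <h1>Example catalogue</h1>
    <ul id="books">
      <li class="book"><a href="/books/1">First title</a></li>
      <li class="book"><a href="/books/2">Second title</a></li>
    </ul>
  </body>
</html>
"""


def extract_links(html):
    """Return (link text, href) pairs for each link inside an <li class="book">."""
    # "html.parser" is Python's built-in parser, so no extra parser package is needed.
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for item in soup.find_all("li", class_="book"):
        link = item.find("a")
        if link is not None and link.has_attr("href"):
            results.append((link.get_text(strip=True), link["href"]))
    return results


links = extract_links(HTML)
for text, href in links:
    print(text, href)
```

Dynamic pages build their content with JavaScript after the page loads, so the HTML returned by requests may not contain the data you want. That is where a browser automation tool such as Selenium comes in.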

Prerequisites

This workshop is intended for learners who already have a basic understanding of Python. In particular, you should be able to:

    • Install and import packages and modules
    • Use lists and dictionaries
    • Use conditional statements (if, else, elif)
    • Use for loops
    • Call functions and understand parameters/arguments and return values

Frequency

3 times a year

Duration

3 hours

Next course

Wednesday 18th March (13:00 - 16:00)

Book here

Can't attend?

 

Help with Python

If you need help with using Python, you can contact our Research Software Group for advice, raise a ServiceDesk ticket with us, or attend one of our drop-in sessions.