mirror of
https://gitlab.dit.htwk-leipzig.de/fsr-im/tools/flatscraper.git
synced 2025-07-16 19:48:49 +02:00
docs: enhance README with improved formatting, features, installation instructions, and emojis
README.md
@@ -1,2 +1,81 @@
# Flatscraper - A Simple Web Scraper for Flat Listings 🔍🏠

**Flatscraper** is a lightweight web scraper that extracts flat listings from a specified URL. It uses the [BeautifulSoup](https://www.crummy.com/software/BeautifulSoup/) library to parse HTML and capture the essential details of each flat: **title**, **price**, and **location**.

## 🚀 Features

- 🏠 **Scrapes flat listings** from a specified URL
- 🔍 **Extracts key details:** title, price, and location
- 💾 **Saves data** to a CSV file
- ⚙️ **Supports command-line arguments** for customization
- 🛠️ **Easy to use and modify** for different websites
- 🧪 **Includes a simple test case** for demonstration
- 📚 **Utilizes Python’s built-in libraries** along with BeautifulSoup for HTML parsing
- 🔔 **Discord Webhook integration** for notifications

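To illustrate the CSV and command-line features listed above, here is a minimal sketch using only Python's standard library. The flag names `--url` and `--output`, the default file name `flats.csv`, and the helpers `build_parser`, `write_csv`, and `main` are assumptions made for this example; they are not taken from the project's code.

```python
import argparse
import csv

# Column order follows the details named in this README.
FIELDS = ["title", "price", "location"]


def build_parser():
    """Hypothetical CLI; the flag names are assumptions, not the project's."""
    parser = argparse.ArgumentParser(description="Scrape flat listings")
    parser.add_argument("--url", required=True, help="page to scrape")
    parser.add_argument("--output", default="flats.csv", help="CSV file to write")
    return parser


def write_csv(listings, handle):
    """Write listing dicts as CSV rows, preceded by a header line."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(listings)


def main(argv=None):
    """Parse arguments and write placeholder data to the chosen file."""
    args = build_parser().parse_args(argv)
    # Placeholder row; a real run would fill this list from the scraped page.
    sample = [{"title": "2-room flat", "price": "450 EUR", "location": "Leipzig"}]
    with open(args.output, "w", newline="", encoding="utf-8") as handle:
        write_csv(sample, handle)
```

Calling `main(["--url", "https://example.org", "--output", "out.csv"])` would write a header row and the placeholder listing to `out.csv`.
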
## 🏢 Housing Providers

- **LWB**
- **Lipsia**
- **BGL**
- **VLW**
- **Wogetra**

## 📦 Requirements

You can run the bot natively on your machine or use a Docker image. The requirements include:

- **Python 3.6 or higher**
- **BeautifulSoup4**
- **Requests**
- **Pandas**
- **Discord Webhook** (optional, for notifications)
- **Docker** (optional, for containerization)

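For the optional Discord notifications, the following sketch shows the general shape of a webhook call: a JSON body with a `content` field sent as a POST request to the webhook URL. It uses the standard library's `urllib` instead of a dedicated Discord package, and the function names, the message format, and the `User-Agent` value are all assumptions for illustration rather than the project's implementation.

```python
import json
import urllib.request


def format_listing(listing):
    """Turn one listing dict into a short notification message."""
    return f"New flat: {listing['title']} | {listing['price']} | {listing['location']}"


def build_webhook_request(webhook_url, message):
    """Build (but do not send) a POST request carrying a JSON `content` field."""
    payload = json.dumps({"content": message}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=payload,
        headers={
            "Content-Type": "application/json",
            # Hypothetical identifier; any descriptive value would do.
            "User-Agent": "flatscraper-sketch",
        },
        method="POST",
    )


def notify(webhook_url, listing):
    """Send the notification; urlopen raises HTTPError on an error status."""
    request = build_webhook_request(webhook_url, format_listing(listing))
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Keeping request construction separate from sending makes it possible to inspect the payload without contacting Discord.
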
## 🛠️ Installation

### 1. Environment Setup

Ensure that the `.env` file is configured correctly. An example is available in the `sample.env` file. Copy it to `.env` and fill in the required values.

The `SAP_SESSIONID` and `COOKIE_SESSSION` values are obtained after performing a search on the LWB website. Use your browser's developer tools to locate them in local storage.

*Future versions will include automatic form processing to obtain a valid session ID.*

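As a rough illustration of how these values could be read and checked before a run, here is a small standard-library sketch. It assumes the `.env` file contains plain `KEY=VALUE` lines; `parse_env` and `load_session` are hypothetical helpers, and the project itself may load its configuration differently.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_session(path=".env"):
    """Return the two session values, failing early if either is missing."""
    with open(path, encoding="utf-8") as handle:
        values = parse_env(handle.read())
    required = ("SAP_SESSIONID", "COOKIE_SESSSION")
    missing = [name for name in required if not values.get(name)]
    if missing:
        raise KeyError(f"missing in {path}: {', '.join(missing)}")
    return {name: values[name] for name in required}
```

Failing early with the names of the missing variables tends to be easier to debug than a rejected request later in the run.
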
### 2. Python Environment

You can use a virtual environment to install the necessary packages:

```bash
# Create a virtual environment
python -m venv venv

# Activate the virtual environment
# On Windows:
venv\Scripts\activate
# On macOS and Linux:
source venv/bin/activate

# Install the required packages
pip install -r requirements.txt
```

### 3. Docker Environment

Alternatively, use the Docker image provided in the repository:

```bash
# Build the Docker image
docker build -t flatscraper .

# Run the Docker container
docker run -it --rm flatscraper
```

## 🎉 Have Fun and Happy Scraping!

Wishing you a great time and speedy flat searching with the bot. If you have any questions or suggestions, feel free to [open an issue on GitLab](https://gitlab.com). I'll respond as soon as possible.

*Happy scraping! 🚀*