mirror of
https://gitlab.dit.htwk-leipzig.de/fsr-im/tools/flatscraper.git
synced 2025-07-15 19:18:49 +02:00
docs: enhance README with improved formatting, features, installation instructions, and emojis
81
README.md
@ -1,2 +1,81 @@
# Flatscraper - A Simple Web Scraper for Flat Listings 🔍🏠

**Flatscraper** is a lightweight web scraper that extracts flat listings from a specified URL. It uses the [BeautifulSoup](https://www.crummy.com/software/BeautifulSoup/) library to parse HTML and capture the essential details of each flat: **title**, **price**, and **location**.

## 🚀 Features

- 🏠 **Scrapes flat listings** from a specified URL
- 🔍 **Extracts key details:** title, price, and location
- 💾 **Saves data** to a CSV file
- ⚙️ **Supports command-line arguments** for customization
- 🛠️ **Easy to use and modify** for different websites
- 🧪 **Includes a simple test case** for demonstration
- 📚 **Utilizes Python’s built-in libraries** along with BeautifulSoup for HTML parsing
- 🔔 **Discord Webhook integration** for notifications
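
To illustrate the "extract key details" and "save to CSV" features listed above, here is a minimal sketch that writes listings with Python's built-in `csv` module. The `Flat` record, the `write_flats_csv` helper, and the sample values are assumptions made for this example; they are not taken from the project's code.

```python
import csv
import io
from typing import Iterable, NamedTuple, TextIO


class Flat(NamedTuple):
    """Hypothetical record holding the three fields the scraper extracts."""
    title: str
    price: str
    location: str


def write_flats_csv(flats: Iterable[Flat], out: TextIO) -> int:
    """Write a header plus one CSV row per flat; return the number of flats written."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(Flat._fields)
    count = 0
    for flat in flats:
        writer.writerow(flat)
        count += 1
    return count


# Made-up sample data; a real run would write to a file opened with newline="".
sample = [
    Flat("2-room flat, balcony", "450 EUR", "Leipzig, Connewitz"),
    Flat("1-room flat", "320 EUR", "Leipzig, Gohlis"),
]
buffer = io.StringIO()
rows_written = write_flats_csv(sample, buffer)
print(buffer.getvalue())
```

Fields that contain a comma are quoted automatically by `csv.writer`, so titles and locations can be stored without manual escaping.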

## 🏢 Housing Providers

- **LWB**
- **Lipsia**
- **BGL**
- **VLW**
- **Wogetra**

## 📦 Requirements

You can run the bot natively on your machine or use a Docker image. The requirements include:

- **Python 3.6 or higher**
- **BeautifulSoup4**
- **Requests**
- **Pandas**
- **Discord Webhook** (optional, for notifications)
- **Docker** (optional, for containerization)
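
For the optional Discord notifications mentioned above, the following sketch shows how a webhook message could be prepared. It deliberately uses only the standard library (`urllib`) rather than a dedicated webhook package, so it is not necessarily how the bot does it. The function names, the message layout, and the placeholder URL are assumptions; the JSON `content` field and its 2000-character limit are part of Discord's webhook API.

```python
import json
import urllib.request

DISCORD_CONTENT_LIMIT = 2000  # Discord rejects message content longer than this


def format_message(title: str, price: str, location: str) -> str:
    """Build a notification text; the layout is an illustrative assumption."""
    text = "New flat: {} | {} | {}".format(title, price, location)
    return text[:DISCORD_CONTENT_LIMIT]


def build_webhook_request(webhook_url: str, content: str) -> urllib.request.Request:
    """Prepare (but do not send) a JSON POST for a Discord webhook."""
    body = json.dumps({"content": content}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Explicit User-Agent: some services block the urllib default.
            "User-Agent": "flatscraper-example/0.1",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> int:
    """Send the prepared request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


# Placeholder URL: replace it with a real webhook URL before calling send().
request = build_webhook_request(
    "https://discord.com/api/webhooks/ID/TOKEN",
    format_message("2-room flat", "450 EUR", "Leipzig"),
)
payload = json.loads(request.data.decode("utf-8"))
print(payload["content"])
```

Building the request separately from sending it keeps the message formatting testable without network access.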

## 🛠️ Installation

### 1. Environment Setup

Make sure the `.env` file is configured correctly: copy the provided `sample.env` to `.env` and fill in the required values.
The `SAP_SESSIONID` and `COOKIE_SESSSION` values are obtained after performing a search on the LWB website. Use your browser's developer tools to locate them in local storage.

*Future versions will include automatic form processing to obtain a valid session ID.*
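
Before starting the bot, it can help to confirm that both session values are actually set. The sketch below parses simple `KEY=VALUE` lines and reports missing entries. It assumes that `sample.env` uses this plain format; the hand-rolled parser and the helper names `parse_env_text` and `missing_keys` are hypothetical and may differ from how the project loads its configuration.

```python
from typing import Dict, List

# Key names exactly as they appear in this README.
REQUIRED_KEYS = ("SAP_SESSIONID", "COOKIE_SESSSION")


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: Dict[str, str]) -> List[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


# Made-up content; in practice: text = open(".env", encoding="utf-8").read()
example_text = """
# session values copied from the browser
SAP_SESSIONID="abc123"
COOKIE_SESSSION=
"""
config = parse_env_text(example_text)
print("missing:", missing_keys(config))
```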

### 2. Python Environment

You can use a virtual environment to install the necessary packages:

```bash
# Create a virtual environment
python -m venv venv

# Activate the virtual environment
# On Windows:
venv\Scripts\activate
# On macOS and Linux:
source venv/bin/activate

# Install the required packages
pip install -r requirements.txt
```

### 3. Docker Environment

Alternatively, use the Docker image provided in the repository:

```bash
# Build the Docker image
docker build -t flatscraper .

# Run the Docker container
docker run -it --rm flatscraper
```

## 🎉 Have Fun and Happy Scraping!

Wishing you a great time and speedy flat searching with the bot. If you have any questions or suggestions, feel free to [open an issue on GitLab](https://gitlab.com). I'll respond as soon as possible.

*Happy scraping! 🚀*