时光

时光的时光轴


Incomplete Guide to Scraping IPTV Sources

It's the start of the year again, and some mysterious force is pushing me to rework my home network architecture. Two years ago at this time I got the "Xiaomi Router AX3600," last year the "J4125 Soft Router," and this year (a product from a certain mainland).

Of course, I've known about this product for quite a while. As for why I chose to get it this year... 🤔 I'm not too sure either. There are already plenty of reviews and introductions online, so if you're interested, you can search for them yourself.

Since this product can also serve as a "TV box," setting up TV channels matters a lot.

Obtaining IPTV Sources

There are plenty of ready-made IPTV solutions online. Some include thousands of channels from around the world, and there are also paid sources that are more stable and professionally maintained.

Originally, I thought about using public sources to save trouble, but then I remembered that I was given an IPTV box when I signed up for broadband at home. I quickly found it, connected it to the network, and turned it on. The system is a bit rubbish, but it has the IPTV resources I want, so let's get started!

Understanding the Requirements of the IPTV Box

After some research, I found that different IPTV providers have different requirements. Some even provide a dedicated "VLAN" for IPTV to ensure a better experience. You first need to determine whether your IPTV box is authorized via "VLAN," "PPPoE," or some other method. If it is, you will most likely need to set up some kind of forwarding before the streams will play in third-party IPTV software.

The implementation in this article assumes that only "a one-time authentication when obtaining the stream" is required, so I'm quite lucky 😗. Let's continue!

Capturing and Analyzing Packets

First, we need to capture packets to understand how the IPTV box obtains the live stream. Connect the box to a network that you can control. You can use a router with packet capture functionality or set up a network bridge. You can search for specific methods yourself. For example, the packet capture tool that comes with "AiKuai" is very useful:

With the box powered off, start the capture, then turn the box on and play a few channels at random. After that, stop the capture and open the result in Wireshark:

image

So many packets? Let's look for the live source directly. The stream used by IPTV is most likely an "M3U" playlist delivered over "HTTP," so we can filter the packets for "http" and the string "m3u":

image

We found one quickly. Let's try opening this URL directly in a player:

image

It works! Now, let's scroll up and look at the requests to see how the URL of this stream is generated.

image

This is the approximate request process for my device. Since the content format is not a subscription format that IPTV players can directly understand, we need something to replace the IPTV box to complete the request and convert the content into a format that the IPTV player can understand.
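The conversion step can be sketched as a pure function. This is a minimal Go sketch, assuming the channel list JSON has the fields used by the code later in this post (`result.reason`, `channelList`, `channelId`, `channelName`, `callsign`). The `buildM3U` name, the `baseURL` parameter, and the sample data are made up for illustration, and your provider's response may look different.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// channelList mirrors only the fields this sketch needs; the real
// response from your provider may differ.
type channelList struct {
	Result struct {
		Reason string `json:"reason"`
	} `json:"result"`
	ChannelList []struct {
		ChannelID   string `json:"channelId"`
		ChannelName string `json:"channelName"`
		Callsign    string `json:"callsign"`
	} `json:"channelList"`
}

// buildM3U converts the JSON channel list into an M3U playlist.
// baseURL is a hypothetical prefix; each entry points at baseURL + channelId.
func buildM3U(raw []byte, baseURL string) (string, error) {
	var list channelList
	if err := json.Unmarshal(raw, &list); err != nil {
		return "", err
	}
	if list.Result.Reason != "ok" {
		return "", fmt.Errorf("channel list request failed: %q", list.Result.Reason)
	}

	var b strings.Builder
	b.WriteString("#EXTM3U\n")
	for _, ch := range list.ChannelList {
		if ch.ChannelName == "" {
			continue // skip unnamed channels entirely
		}
		b.WriteString("#EXTINF:-1")
		if ch.Callsign != "" {
			b.WriteString(" tvg-logo=\"" + ch.Callsign + "\"")
		}
		b.WriteString("," + ch.ChannelName + "\n")
		b.WriteString(baseURL + ch.ChannelID + "\n")
	}
	return b.String(), nil
}

func main() {
	// Made-up sample data, not a real provider response.
	sample := []byte(`{"result":{"reason":"ok"},"channelList":[
		{"channelId":"1001","channelName":"News","callsign":"http://example.invalid/news.png"},
		{"channelId":"1002","channelName":"Sports"}]}`)

	playlist, err := buildM3U(sample, "http://192.0.2.10/ch/")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(playlist)
}
```

Keeping the conversion separate from the HTTP handling makes it easy to check the playlist format without touching the provider's server.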

Obtaining and Processing IPTV Information

Testing showed that requests to the service provider's IPTV server must originate from an IP address belonging to the local service provider. So we need a device on the local network to forward the requests.

At first, I went with intranet tunneling plus cloud functions (🤨 I later realized this was overcomplicated and unnecessary, probably because I had originally planned to watch from outside my home network...).

I wrote some simple logic in PHP:

<?php

$server = "URL of the tunneling service";
$channel = explode("/channel/",$_SERVER["REQUEST_URI"],2);

header('Content-Type: application/vnd.apple.mpegurl');
header('Expires: 0');

if(count($channel) != 2){
    exit(make_m3u());
}
get_stream($channel[1]);

function make_m3u(){
    global $server;
    $file = "#EXTM3U".PHP_EOL;
    $endpoint = "Get channel list";
    $params = [
        // Authentication information
    ];

    $list = file_get_contents($server.$endpoint."?".http_build_query($params));
    $list = json_decode($list,true);
    if($list['result']['reason'] != 'ok')
        return 'fail';
    
    $list = $list['channelList'];
    foreach($list as $channel){
        $file.= "#EXTINF:-1";
        if (isset($channel['callsign']))
            $file.=" tvg-logo=\"".$channel['callsign']."\"";
        // The comma before the channel name is required even when there are no attributes
        $file.= ",".$channel['channelName'];
        $file.= PHP_EOL."URL for parsing channel information".$channel['channelId'].PHP_EOL;
    }

    return $file;
}

function get_stream($channel){
    global $server;
    $endpoint = "Get live source";

    $params = [
        // Authentication and channel information
    ];
    $ch = curl_init($server.$endpoint."?".http_build_query($params));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HTTPHEADER, [
        'X-Forwarded-IP: '.$_SERVER['HTTP_X_FORWARDED_FOR']
    ]);
    $response = curl_exec($ch);
    curl_close($ch);
    $stream = json_decode($response,true);
    if($stream['result']['reason'] != 'ok')
        exit('fail'); // a bare return here would send nothing to the client
    
    header('Location: ' . $stream['playAddress'], true, 302);
}
?>

It can parse and obtain the video stream correctly.
image

However, I found that this request was going around in circles on the public network, from home back to home... It was too slow and extremely inelegant.

These days I'm no longer keen on running a server at home (maybe the novelty has worn off), so I want to keep deployment as simple as possible. Let's keep optimizing this process.

A Note on a Reverse Proxy Pitfall

image

The stream URL returned after authentication includes a "userIp" parameter. Experimentation showed that its value comes from the "X-Forwarded-For" request header (exposed to PHP as "HTTP_X_FORWARDED_FOR"), which some CDNs, forwarding services, and web servers add automatically so that the final destination can identify the visitor's real IP address. Unfortunately, this header gets in the way during authentication.
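To see which address the server recorded, you can read the "userIp" parameter back out of the returned stream URL. Here is a minimal Go sketch. The `userIPFromPlayAddress` helper and the sample URL are hypothetical, and a real stream URL will carry other parameters as well.

```go
package main

import (
	"fmt"
	"net/url"
)

// userIPFromPlayAddress extracts the "userIp" query parameter from a stream URL.
// It returns an empty string when the parameter is absent.
func userIPFromPlayAddress(playAddress string) (string, error) {
	u, err := url.Parse(playAddress)
	if err != nil {
		return "", err
	}
	return u.Query().Get("userIp"), nil
}

func main() {
	// Hypothetical stream URL, for illustration only.
	ip, err := userIPFromPlayAddress("http://example.invalid/live/index.m3u8?userIp=203.0.113.7&token=abc")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(ip) // prints 203.0.113.7
}
```

If the value printed here is not the address you expected, something along the request path has probably rewritten the forwarding header.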

The fix is simple. I use "Caddy" as the reverse proxy, so I only need to add one line of configuration:

header_up X-Forwarded-For {http.request.header.Fake-IP}

Of course, you can pick any header name you like. The request carries the desired address in the "Fake-IP" header, so that nothing along the request path can append an IP address to it 😂

Using Go to Implement the Processing Part

Let's go straight to the code:

package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"net/url"
	"strings"
)

const sourceServer = "URL of the video stream server"
const listServer = "URL of the channel list server"

func main() {
	http.HandleFunc("/", getList)
	http.HandleFunc("/ch/", getChannel)

	log.Fatal(http.ListenAndServe(":80", nil))
}

func getList(w http.ResponseWriter, r *http.Request) {
	// Only the fields this program needs from the channel list response.
	var data struct {
		Result struct {
			Reason string `json:"reason"`
		} `json:"result"`
		ChannelList []struct {
			ChannelID   string `json:"channelId"`
			ChannelName string `json:"channelName"`
			Callsign    string `json:"callsign"`
		} `json:"channelList"`
	}

	params := url.Values{
		// Authentication information
	}
	resp, err := http.Get(listServer + "?" + params.Encode())
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadGateway)
		return
	}
	defer resp.Body.Close()

	// Stop on a bad response instead of carrying on with missing data.
	if err := json.NewDecoder(resp.Body).Decode(&data); err != nil || data.Result.Reason != "ok" {
		http.Error(w, "fail", http.StatusBadGateway)
		return
	}

	var file strings.Builder
	file.WriteString("#EXTM3U\n")
	for _, channel := range data.ChannelList {
		// Skip unnamed channels before writing anything, so no dangling #EXTINF line is left behind.
		if channel.ChannelName == "" {
			continue
		}
		file.WriteString("#EXTINF:-1")
		if channel.Callsign != "" {
			file.WriteString(" tvg-logo=\"" + channel.Callsign + "\"")
		}
		file.WriteString("," + channel.ChannelName + "\n")
		file.WriteString("http://IP of the terminal running the program/ch/" + channel.ChannelID + "\n")
	}

	w.Header().Set("Content-Type", "application/vnd.apple.mpegurl")
	w.Header().Set("Expires", "0")
	fmt.Fprint(w, file.String())
}

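func getChannel(w http.ResponseWriter, r *http.Request) {
	// channel holds the requested channel ID; pass it along in params below.
	channel := strings.TrimPrefix(r.URL.Path, "/ch/")
	var data struct {
		Result struct {
			Reason string `json:"reason"`
		} `json:"result"`
		PlayAddress string `json:"playAddress"`
	}

	params := url.Values{
		// Authentication and channel information (include channel here)
	}

	resp, err := http.Get(sourceServer + "?" + params.Encode())
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadGateway)
		return
	}
	defer resp.Body.Close()

	// Stop on a bad response instead of redirecting to an empty address.
	if err := json.NewDecoder(resp.Body).Decode(&data); err != nil || data.Result.Reason != "ok" {
		http.Error(w, "fail", http.StatusBadGateway)
		return
	}

	http.Redirect(w, r, data.PlayAddress, http.StatusFound)
}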

With this, you only need to put the program into a simple Linux environment (I just threw it into Docker), and you can run the conversion server on the soft router.

Enjoy your TV!
