Notes on indexing YouTube math channels
Off the top of my head, these are the types of channels I've encountered:
Channels where we want to pick out which videos to keep.
Channels where we want to pick out which videos to exclude. For example, most of Eddie Woo's channel consists of high-quality math videos, but there are a few promotional videos, which we would like to exclude.
Channels where we want to pick out just a few playlists. Maybe only a few playlists are dedicated to math, and all other videos are non-mathematical. For example, a university might publish lots of videos, each belonging to a course playlist. Some of those courses are math-related, but most are not.
Channels where videos or playlists are titled according to some sort of pattern. The pattern can be detected by matching the title against a regex. If the titles can't be matched by a regex, or writing one would be difficult, we could pass the title to a total predicate (a function that returns true or false for every possible title) instead.
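To make the regex-versus-predicate idea concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the names `TitleMatcher`, `regex_matcher`, and `is_numbered_lecture`, the sample titles, and the "Lecture N" rule are not taken from any real channel. The point is only that a regex and a predicate can share one shape, a function from title to true/false.

```python
import re
from typing import Callable

# A title matcher is any function that maps a title to True or False.
TitleMatcher = Callable[[str], bool]


def regex_matcher(pattern: str) -> TitleMatcher:
    """Wrap a regex so it can be used wherever a predicate is expected."""
    compiled = re.compile(pattern, re.IGNORECASE)
    return lambda title: compiled.search(title) is not None


def is_numbered_lecture(title: str) -> bool:
    """Hypothetical predicate: accept 'Lecture N' only when 1 <= N <= 40.

    A numeric range check like this is awkward to express as a regex,
    but easy to write as ordinary code. It returns a bool for every
    input string, so it is total.
    """
    match = re.match(r"Lecture (\d+)", title)
    return match is not None and 1 <= int(match.group(1)) <= 40


# Made-up titles, for illustration only.
titles = [
    "Calculus 1: Limits",
    "Lecture 12: Eigenvalues",
    "Lecture 99: Campus tour",
    "Channel trailer",
]

calculus = regex_matcher(r"^calculus\b")
kept_by_regex = [t for t in titles if calculus(t)]
kept_by_predicate = [t for t in titles if is_numbered_lecture(t)]
```

Treating both as the same kind of function means the rest of the indexer would not need to care which one a channel uses.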
Each channel could have a set of commands, where each command is either a bare name or a name-value pair. The include commands would be executed first, then the exclude commands, so an exclude always overrides an include.
include_video: yt_video_id
include_playlist: yt_playlist_id
include_all_playlists
include_videos_by_regex: regex
include_playlists_by_regex: regex
include_videos_by_predicate: predicate
include_playlists_by_predicate: predicate
include_all_videos
exclude_video: yt_video_id
exclude_playlist: yt_playlist_id
exclude_video_by_regex: regex
exclude_playlist_by_regex: regex
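A rough Python sketch of how such a command set might be evaluated, includes first and excludes second. The `Channel` class, the `resolve` function, and the sample ids and titles are all hypothetical, and only a subset of the commands is handled. Nothing here calls the YouTube API; the channel is assumed to have already been fetched into memory.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Channel:
    """Hypothetical in-memory snapshot of a channel."""
    videos: dict[str, str]  # video id -> title
    playlists: dict[str, list[str]] = field(default_factory=dict)  # playlist id -> video ids


def resolve(channel, commands):
    """Return the set of video ids selected by `commands`.

    Each command is a (name, value) tuple; value is None for bare names.
    All include commands run before any exclude command, regardless of
    the order in which they are listed.
    """
    includes = [c for c in commands if c[0].startswith("include_")]
    excludes = [c for c in commands if c[0].startswith("exclude_")]

    selected = set()
    for name, value in includes:
        if name == "include_all_videos":
            selected |= set(channel.videos)
        elif name == "include_video":
            selected.add(value)
        elif name == "include_playlist":
            selected |= set(channel.playlists.get(value, []))
        elif name == "include_videos_by_regex":
            selected |= {v for v, t in channel.videos.items() if re.search(value, t)}
        else:
            raise ValueError(f"command not handled in this sketch: {name}")

    for name, value in excludes:
        if name == "exclude_video":
            selected.discard(value)
        elif name == "exclude_playlist":
            selected -= set(channel.playlists.get(value, []))
        elif name == "exclude_video_by_regex":
            selected -= {v for v, t in channel.videos.items() if re.search(value, t)}
        else:
            raise ValueError(f"command not handled in this sketch: {name}")

    return selected


# Made-up data: a mostly-math channel with one promotional video.
channel = Channel(
    videos={
        "a1": "Solving quadratics",
        "b2": "Intro to matrices",
        "c3": "Win a signed textbook! (promo)",
    },
    playlists={"PL1": ["a1", "b2"]},
)

# The exclude is listed first on purpose; it still runs after the include.
commands = [
    ("exclude_video_by_regex", r"\(promo\)"),
    ("include_all_videos", None),
]
selected = resolve(channel, commands)
```

Splitting the commands into two passes keeps the result independent of the order in which they were written down, which seems to be the intent of running includes before excludes.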