mirror of https://github.com/aljazceru/crawler_v2.git
synced 2025-12-17 07:24:21 +01:00
added readme and package-level comments
README.md (new file, 26 lines)
@@ -0,0 +1,26 @@
# Crawler v2

This repo is a rewrite of the original and discontinued [crawler](https://github.com/vertex-lab/crawler), under active development.
## Goals

The goals of this project are:
- Continuously crawl the Nostr network (24/7/365), searching for follow lists (`kind:3`) and other relevant events.

- Quickly assess whether new events should be added to the database based on the author's rank. Approved events are used to build a custom Redis-backed graph database.

- Generate and maintain random walks for nodes in the graph, updating them as the graph topology evolves.

- Use these random walks to efficiently compute acyclic Monte Carlo PageRanks (personalized and global). The algorithms are inspired by [this paper](https://snap.stanford.edu/class/cs224w-readings/bahmani10pagerank.pdf).

## Apps

`/cmd/crawler/`

The main entry point, which assumes that the event store and Redis are synchronized. If they are empty, the graph will be initialized using the `INIT_PUBKEYS` specified in the environment.

`/cmd/sync/`

This mode builds the Redis graph database from the event store. In other words, it synchronizes Redis to reflect the events in the event store, starting from the `INIT_PUBKEYS` specified in the environment and expanding outward.
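For orientation, the environment for either entry point might look like the following. Only `INIT_PUBKEYS` is named in this README; the value format shown (a comma-separated list of hex pubkeys) is an assumption, and the values are placeholders:

```shell
# Pubkeys used to seed the graph when Redis and the event store are empty.
# Comma-separated hex keys (format assumed; placeholder values).
INIT_PUBKEYS=<pubkey-hex-1>,<pubkey-hex-2>
```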
@@ -18,6 +18,12 @@ import (
	"github.com/vertex-lab/relay/pkg/eventstore"
)

/*
This program assumes synchronization between Redis and the event store, meaning
that the graph in Redis reflects the stored events.
If that is not the case, go run /cmd/sync/ to synchronize Redis with the event store.
*/

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
@@ -20,7 +20,7 @@ import (

/*
This program synchronizes the Redis database to the events already stored in the event store.
If Redis and the event store are already in sync, go run /cmd/crawler/.
*/

func main() {
@@ -1,3 +1,4 @@

// The config package loads and validates the variables in the environment into a [Config]
package config

import (
@@ -1,3 +1,4 @@

// The pipe package defines high-level pipeline functions (e.g. [Firehose], [Engine])
package pipe

import (