Commit Graph

822 Commits

Author SHA1 Message Date
Moon Man a5b041c03b Merge remote-tracking branch 'upstream/qdrant-search-2' into spc2 2024-05-19 14:15:06 +00:00
Lain Soykaf c139a9f38c B Config: Set default Qdrant embedder to our fastembed-api server 2024-05-19 12:39:54 +04:00
Lain Soykaf 72ec261a69 B QdrantSearch: Switch to OpenAI api 2024-05-19 12:17:46 +04:00
Moon Man d83a15e879 Merge remote-tracking branch 'upstream/qdrant-search-2' into spc2 2024-05-14 13:50:45 +00:00
Lain Soykaf cd7e2138d1 Search: Basic Qdrant/Ollama search 2024-05-14 14:13:37 +04:00
Mark Felder d21aa1a77c Respect the TTL returned in OpenGraph tags 2024-05-07 19:54:56 -04:00
Mark Felder df0734fcbf Increase the :max_body for Rich Media to 5MB
Websites are getting increasingly bloated with tricks like inlining content (e.g., CNN.com), which puts pages at or above 5MB. This value may still be too low.
2024-05-07 19:54:56 -04:00
Mark Felder ede414094f RichMedia refactor
Rich Media parsing was previously handled on demand with a 2 second HTTP request timeout and retained only in Cachex. Every time a Pleroma instance was restarted, it had to request and parse the data again for each status with a detected URL. When a batch of statuses was fetched, they were processed in parallel to try to keep the maximum latency at 2 seconds, but this often resulted in a timeline appearing to hang during loading because of a URL that could not be reached. URLs with image links that expire (Amazon AWS) were parsed and inserted with a TTL to ensure the image link would not break.

Rich Media data is now cached in the database and fetched asynchronously. Cachex is used as a read-through cache. When the data becomes available we stream an update to the clients. If the result is returned quickly the experience is almost seamless. Activities were already processed for their Rich Media data during ingestion to warm the cache, so users should not normally encounter the asynchronous loading of the Rich Media data.

Implementation notes:

- The async worker is a Task with a globally unique process name to prevent duplicate processing of the same URL
- The Task will attempt to fetch the data 3 times with increasing sleep time between attempts
- The HTTP request obeys the default HTTP request timeout value instead of 2 seconds
- URLs that cannot be successfully parsed due to an unexpected error receive a negative cache entry for 15 minutes
- URLs that fail with an expected error receive a negative cache entry with no TTL
- Activities that have no detected URLs insert a nil value in the Cachex :scrubber_cache so we do not repeat parsing the object content with Floki every time the activity is rendered
- Expiring image URLs are handled with an Oban job
- There is no automatic cleanup of the Rich Media data in the database, but it is safe to delete at any time
- The post draft/preview feature makes the URL processing synchronous so the rendered post preview will have an accurate rendering

Overall performance of timelines and creating new posts which contain URLs is greatly improved.
2024-05-07 19:54:56 -04:00
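
The retry and negative-cache notes in the RichMedia refactor message above can be illustrated with a minimal sketch. This is Python rather than Pleroma's Elixir, and it is not the project's implementation: `ExpectedError`, `fetch_with_retries`, `fetch_and_cache`, the one-second base sleep, and the dictionary used as a cache are all hypothetical names and values chosen for illustration. The sketch also assumes every failure is retried before it is classified, which the commit message does not state.

```python
import time


class ExpectedError(Exception):
    """Hypothetical marker for a failure treated as 'expected'."""


NEGATIVE_TTL_SECONDS = 15 * 60  # 15 minutes, as in the implementation notes


def fetch_with_retries(fetch, url, attempts=3, base_sleep=1.0, sleep=time.sleep):
    """Call fetch(url) up to `attempts` times, sleeping longer after each failure."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fetch(url)
        except Exception as error:
            last_error = error
            if attempt < attempts:
                # Increasing sleep: base, 2 * base, ... (assumed schedule)
                sleep(base_sleep * attempt)
    raise last_error


def fetch_and_cache(cache, fetch, url, now=time.time, sleep=time.sleep):
    """Store parsed data or a negative entry in `cache`.

    `cache` maps url -> (value, expires_at); expires_at of None means no TTL.
    """
    try:
        value = fetch_with_retries(fetch, url, sleep=sleep)
        cache[url] = (value, None)
    except ExpectedError:
        # Expected failure: negative entry with no TTL.
        cache[url] = (None, None)
    except Exception:
        # Unexpected failure: negative entry that expires after 15 minutes.
        cache[url] = (None, now() + NEGATIVE_TTL_SECONDS)
    return cache[url]
```

Passing `sleep` and `now` as parameters keeps the sketch easy to exercise without real delays; a caller would normally leave them at their defaults.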
Moon Man 5d4913bb93 Merge remote-tracking branch 'origin/rich-media-db' into spc2 2024-05-05 13:06:01 -05:00
Mark Felder b067fbde31 Respect the TTL returned in OpenGraph tags 2024-05-05 13:51:13 -04:00
Mark Felder 2079e92c5c Increase the :max_body for Rich Media to 5MB
Websites are getting increasingly bloated with tricks like inlining content (e.g., CNN.com), which puts pages at or above 5MB. This value may still be too low.
2024-05-05 13:51:13 -04:00
Mark Felder a6407f9ba5 RichMedia refactor
Rich Media parsing was previously handled on demand with a 2 second HTTP request timeout and retained only in Cachex. Every time a Pleroma instance was restarted, it had to request and parse the data again for each status with a detected URL. When a batch of statuses was fetched, they were processed in parallel to try to keep the maximum latency at 2 seconds, but this often resulted in a timeline appearing to hang during loading because of a URL that could not be reached. URLs with image links that expire (Amazon AWS) were parsed and inserted with a TTL to ensure the image link would not break.

Rich Media data is now cached in the database and fetched asynchronously. Cachex is used as a read-through cache. When the data becomes available we stream an update to the clients. If the result is returned quickly the experience is almost seamless. Activities were already processed for their Rich Media data during ingestion to warm the cache, so users should not normally encounter the asynchronous loading of the Rich Media data.

Implementation notes:

- The async worker is a Task with a globally unique process name to prevent duplicate processing of the same URL
- The Task will attempt to fetch the data 3 times with increasing sleep time between attempts
- The HTTP request obeys the default HTTP request timeout value instead of 2 seconds
- URLs that cannot be successfully parsed due to an unexpected error receive a negative cache entry for 15 minutes
- URLs that fail with an expected error receive a negative cache entry with no TTL
- Activities that have no detected URLs insert a nil value in the Cachex :scrubber_cache so we do not repeat parsing the object content with Floki every time the activity is rendered
- Expiring image URLs are handled with an Oban job
- There is no automatic cleanup of the Rich Media data in the database, but it is safe to delete at any time
- The post draft/preview feature makes the URL processing synchronous so the rendered post preview will have an accurate rendering

Overall performance of timelines and creating new posts which contain URLs is greatly improved.
2024-05-05 13:51:13 -04:00
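
The read-through arrangement described in the RichMedia refactor message above (an in-memory cache in front of database-backed data) can be sketched roughly as follows. This is a Python illustration, not Pleroma's Elixir code or the Cachex API: `ReadThroughCache`, `load_from_store`, and the plain dictionary standing in for the database are hypothetical. Caching a `None` result is an assumption of the sketch, similar in spirit to the note about storing a nil value to avoid repeated parsing.

```python
class ReadThroughCache:
    """In-memory cache that falls back to a slower backing store on a miss."""

    def __init__(self, load_from_store):
        self._load = load_from_store  # callable: key -> value or None
        self._entries = {}
        self.store_reads = 0  # counts how often the backing store was consulted

    def get(self, key):
        if key in self._entries:
            return self._entries[key]
        # Miss: read from the backing store and remember the result,
        # including None, so the store is not queried again for this key.
        self.store_reads += 1
        value = self._load(key)
        self._entries[key] = value
        return value


# Usage: a dict stands in for the database table of parsed Rich Media data.
database = {"https://example.com": {"title": "Example"}}
cache = ReadThroughCache(database.get)
first = cache.get("https://example.com")   # read from the backing store
second = cache.get("https://example.com")  # served from memory
```

After the two calls in the usage lines, `cache.store_reads` is 1, because only the first lookup reached the backing store.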
Mark Felder 7f97fbc1ae Update minimum Postgres version to 11.0; disable JIT
Postgres 11 is the release that introduced JIT, and JIT should be disabled. Pleroma's queries do not benefit from JIT, and it can increase query latency.
2024-03-18 15:36:26 -04:00
marcin mikołajczak 9cfa4e67b1 Add ForceMention mrf
Signed-off-by: marcin mikołajczak <git@mkljczk.pl>
2024-03-01 18:16:09 +01:00
Moon Man 9ca62f74be Merge remote-tracking branch 'origin/logger-metadata' into spc2 2024-02-28 12:42:19 -06:00
Mark Felder 64ad451a7b Websocket refactor to use Phoenix.Socket.Transport
This will make us compatible with Cowboy and Bandit
2024-02-14 15:27:07 -05:00
Mark Felder 653b14e1c7 Use config to control Uploader callback timeout 2024-01-22 18:37:13 -05:00
Mark Felder 17877f612e Use config to control streamer registry 2024-01-20 18:51:20 -05:00
Mark Felder 4bb57d4f25 Use config to control background migrators 2024-01-20 18:47:25 -05:00
Mark Felder c7eda0b24a Use config to control loading of custom modules 2024-01-20 18:43:53 -05:00
Mark Felder 029aaf3d74 Use config to control max_restarts 2024-01-20 18:41:04 -05:00
Mark Felder 1d816222e0 Remove support for multiple federation publisher modules
This also unravels some needless indirection.
2023-12-28 11:55:19 -05:00
Mark Felder 241c7175bd Logger metadata for request path and authenticated user 2023-12-17 18:20:22 -05:00
Mark Felder f01ad493f3 Logger metadata for inbound federation requests 2023-12-09 18:32:26 -05:00
lain ef7bda61ad Merge branch 'promex' into 'develop'
Switch to PromEx for prometheus metrics

See merge request pleroma/pleroma!3967
2023-11-28 07:50:16 +00:00
Henry Jameson a5aa8ea796 Add support for configuring a favicon and embed PWA manifest in server-generated-meta 2023-11-14 11:05:23 +01:00
Mark Felder 66cb3294ed Switch to PromEx for prometheus metrics
We recommend using the separate HTTP server to expose the metrics
and securing it externally with your firewall or reverse proxy. It
listens on port 4021 by default.
2023-11-13 15:34:59 -05:00
lain 5f19fbc5a9 Merge branch 'phoenix1.7' into 'develop'
Update to Phoenix 1.7

See merge request pleroma/pleroma!3900
2023-11-12 13:34:27 +00:00
Lain Soykaf 0c5cc51983 Merge branch 'develop' of git.pleroma.social:pleroma/pleroma into pleroma-meilisearch 2023-11-12 13:53:18 +04:00
Mark Felder a0e08c6ec2 Merge branch 'develop' into phoenix1.7 2023-11-07 16:05:04 -05:00
Mark Felder bf426c53b4 Fix digest email processing, consolidate Oban queues
The email related jobs can all share a single Oban queue
2023-11-07 15:14:36 -05:00
tusooa 163e563733 Allow more flexibility in InlineQuotePolicy 2023-09-13 19:19:05 -04:00
Alex Gleason 93e4972b50 Add InlineQuotePolicy as a default MRF 2023-09-13 19:19:04 -04:00
Alex Gleason 57ef1d1211 Add InlineQuotePolicy to force quote URLs inline 2023-09-13 19:19:04 -04:00
tusooa 28ff828caa Add emoji policy to remove emojis matching certain urls
https://git.pleroma.social/pleroma/pleroma/-/issues/2775
2023-07-07 06:58:22 -04:00
Haelwenn 41f2ee69a8 Merge branch 'from/upstream-develop/tusooa/backup-status' into 'develop'
Detail backup states

Closes #3024

See merge request pleroma/pleroma!3809
2023-06-27 12:08:11 +00:00
Mark Felder ffee478ed0 Move websocket config for Shoutbox to the Endpoint
This is the modern way of configuring it
2023-05-31 15:30:58 -04:00
Mark Felder a7e7db4a29 Phoenix.Endpoint.Cowboy2Handler -> Plug.Cowboy.Handler 2023-05-31 13:48:16 -04:00
duponin 0231a09310 Remove SSH/BBS feature from core
And link to sshocial, the replacement client for this removed feature
2023-04-23 10:47:07 +02:00
tusooa 179efd9467 Make backup parameters configurable 2022-12-24 00:20:25 -05:00
Ekaterina Vaartis 398141da68 Merge remote-tracking branch 'upstream/develop' into meilisearch 2022-12-20 21:00:07 +03:00
Sean King 60df2d8a97 Merge branch 'develop' of git.pleroma.social:pleroma/pleroma into fine_grained_moderation_privileges 2022-12-18 22:03:48 -07:00
tusooa 1b0e47b79b Merge branch 'from/upstream-develop/tusooa/no-strip-report' into 'develop'
Give admin the choice to not strip reported statuses

Closes #2887

See merge request pleroma/pleroma!3773
2022-11-12 17:55:50 +00:00
Mark Felder 6b87b3f2ea Remove Quack logging backend 2022-11-11 12:36:29 -05:00
marcin mikołajczak eb70676931 Update links to Soapbox
Signed-off-by: marcin mikołajczak <git@mkljczk.pl>
2022-11-11 12:13:30 +01:00
tusooa 6f047cc308 Do not strip reported statuses when configured not to 2022-11-09 22:36:57 -05:00
Alexander Strizhakov 4121bca895 expanding WebFinger 2022-11-03 09:48:24 -04:00
Ekaterina Vaartis fd2cfc80d2 Change search_indexing = 10 and retries for indexing = 2 2022-10-10 20:19:09 +03:00
Ekaterina Vaartis 2bc21c6f18 Use oban for search indexing 2022-10-10 20:19:09 +03:00
Ekaterina Vaartis 3179ed0921 Make chunk size configurable 2022-10-10 20:19:09 +03:00