Commit Graph

16142 Commits

Author SHA1 Message Date
Mark Felder ede414094f RichMedia refactor
Rich Media parsing was previously handled on-demand with a 2 second HTTP request timeout and retained only in Cachex. Every time a Pleroma instance was restarted, it had to request and parse the data again for each status with a detected URL. When a batch of statuses was fetched, they were processed in parallel to try to keep the maximum latency at 2 seconds, but this often resulted in a timeline appearing to hang during loading due to a URL that could not be reached. URLs with image links that expire (Amazon AWS) were parsed and inserted with a TTL to ensure the image link would not break.

Rich Media data is now cached in the database and fetched asynchronously. Cachex is used as a read-through cache. When the data becomes available we stream an update to the clients. If the result is returned quickly the experience is almost seamless. Activities were already processed for their Rich Media data during ingestion to warm the cache, so users should not normally encounter the asynchronous loading of the Rich Media data.
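
The read-through arrangement described above can be sketched roughly as follows. This is a minimal illustration in Python, not Pleroma's Elixir implementation: the class name `ReadThroughCache`, the plain dicts standing in for Cachex and the database table, and the `enqueue_fetch` callback are all assumptions made for the example.

```python
from typing import Callable, Dict, Optional


class ReadThroughCache:
    """Hypothetical in-memory cache placed in front of a persistent store."""

    def __init__(self, store: Dict[str, dict], enqueue_fetch: Callable[[str], None]):
        self.store = store                  # stands in for the database table
        self.memory: Dict[str, dict] = {}   # stands in for the in-memory cache
        self.enqueue_fetch = enqueue_fetch  # schedules an asynchronous fetch

    def get(self, url: str) -> Optional[dict]:
        # 1. Fast path: the card is already in memory.
        if url in self.memory:
            return self.memory[url]
        # 2. Read through to the persistent store and warm the memory cache.
        card = self.store.get(url)
        if card is not None:
            self.memory[url] = card
            return card
        # 3. Nothing known yet: schedule a background fetch and return
        #    immediately so that rendering is never blocked on HTTP.
        self.enqueue_fetch(url)
        return None
```

In this sketch a miss returns `None` right away; the caller would render the status without a card and rely on a later update once the background fetch has stored a result.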

Implementation notes:

- The async worker is a Task with a globally unique process name to prevent duplicate processing of the same URL
- The Task will attempt to fetch the data 3 times with increasing sleep time between attempts
- The HTTP request obeys the default HTTP request timeout value instead of 2 seconds
- URLs that cannot be successfully parsed due to an unexpected error receive a negative cache entry for 15 minutes
- URLs that fail with an expected error receive a negative cache entry with no TTL
- Activities that have no detected URLs insert a nil value in the Cachex :scrubber_cache so we do not repeat parsing the object content with Floki every time the activity is rendered
- Expiring image URLs are handled with an Oban job
- There is no automatic cleanup of the Rich Media data in the database, but it is safe to delete at any time
- The post draft/preview feature makes the URL processing synchronous so the rendered post preview will have an accurate rendering
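
The retry and negative-cache notes above can be illustrated with a short sketch. It is written in Python rather than Elixir and rests on several assumptions that the commit message does not state: the names `fetch_with_retries`, `is_negatively_cached` and `ExpectedError` are hypothetical, the sleep schedule (1s, then 2s) is invented, and expected errors are assumed to stop retrying immediately. The globally unique process name is not modelled here.

```python
import time
from typing import Callable, Dict, Optional

NEGATIVE_TTL_SECONDS = 15 * 60  # 15 minutes, as stated in the notes


class ExpectedError(Exception):
    """Hypothetical marker for failures that a retry will not fix."""


def fetch_with_retries(
    url: str,
    fetch: Callable[[str], dict],
    negative_cache: Dict[str, Optional[float]],
    attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
    now: Callable[[], float] = time.monotonic,
) -> Optional[dict]:
    """Try `fetch` up to `attempts` times, sleeping longer after each failure."""
    for attempt in range(1, attempts + 1):
        try:
            return fetch(url)
        except ExpectedError:
            # Assumption: expected errors are cached with no expiry (None).
            negative_cache[url] = None
            return None
        except Exception:
            if attempt < attempts:
                sleep(attempt)  # assumed schedule: 1s, then 2s
    # Every attempt hit an unexpected error: cache the failure for 15 minutes.
    negative_cache[url] = now() + NEGATIVE_TTL_SECONDS
    return None


def is_negatively_cached(
    url: str,
    negative_cache: Dict[str, Optional[float]],
    now: Callable[[], float] = time.monotonic,
) -> bool:
    """True while a negative entry exists and has not yet expired."""
    if url not in negative_cache:
        return False
    expires_at = negative_cache[url]
    return expires_at is None or now() < expires_at
```

Passing `sleep` and `now` as parameters keeps the sketch testable without real delays; a caller would check `is_negatively_cached` before scheduling another fetch for the same URL.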

Overall performance of timelines and creating new posts which contain URLs is greatly improved.
2024-05-07 19:54:56 -04:00
feld b42963a52c Merge branch 'revert-50af909c' into 'develop'
Revert "Merge branch 'pleroma-card-image-description' into 'develop'"

See merge request pleroma/pleroma!4107
2024-05-07 23:21:37 +00:00
feld 750fb25f48 Revert "Merge branch 'pleroma-card-image-description' into 'develop'"
This reverts merge request !4101
2024-05-07 23:20:38 +00:00
Mark Felder acf73f7e13 Update changelog entry 2024-05-07 17:48:40 -04:00
Mark Felder 06c26bf9c9 Add the absent max_featured_tags to the api spec for /api/v1/instance 2024-05-07 17:46:05 -04:00
Mark Felder b979389958 Add configuration[accounts][max_pinned_statuses] to /api/v2/instance
Also add the absent max_featured_tags to the api spec for /api/v2/instance
2024-05-07 17:45:02 -04:00
Mark Felder 3cad57bf48 Add configuration[statuses][characters_reserved_per_url] to /api/v2/instance
Fixes #3250
2024-05-07 17:25:30 -04:00
Mark Felder dd03184811 Strip actor from objects before federating 2024-05-07 11:54:45 -04:00
lain ffa6805c09 Merge branch 'description-type' into 'develop'
Fix type in config description

See merge request pleroma/pleroma!4104
2024-05-07 11:39:48 +00:00
Moon Man 5d4913bb93 Merge remote-tracking branch 'origin/rich-media-db' into spc2 2024-05-05 13:06:01 -05:00
Mark Felder 859ad4dbae Fix broken Rich Media parsing when the image URL is a relative path 2024-05-05 13:51:13 -04:00
Mark Felder b067fbde31 Respect the TTL returned in OpenGraph tags 2024-05-05 13:51:13 -04:00
Mark Felder 68dc81b59e Fix broken tests 2024-05-05 13:51:13 -04:00
Mark Felder 2079e92c5c Increase the :max_body for Rich Media to 5MB
Websites are getting increasingly bloated with tricks like inlining content (e.g., CNN.com), which puts pages at or above 5MB. This value may still be too low.
2024-05-05 13:51:13 -04:00
Mark Felder a6407f9ba5 RichMedia refactor
Rich Media parsing was previously handled on-demand with a 2 second HTTP request timeout and retained only in Cachex. Every time a Pleroma instance was restarted, it had to request and parse the data again for each status with a detected URL. When a batch of statuses was fetched, they were processed in parallel to try to keep the maximum latency at 2 seconds, but this often resulted in a timeline appearing to hang during loading due to a URL that could not be reached. URLs with image links that expire (Amazon AWS) were parsed and inserted with a TTL to ensure the image link would not break.

Rich Media data is now cached in the database and fetched asynchronously. Cachex is used as a read-through cache. When the data becomes available we stream an update to the clients. If the result is returned quickly the experience is almost seamless. Activities were already processed for their Rich Media data during ingestion to warm the cache, so users should not normally encounter the asynchronous loading of the Rich Media data.

Implementation notes:

- The async worker is a Task with a globally unique process name to prevent duplicate processing of the same URL
- The Task will attempt to fetch the data 3 times with increasing sleep time between attempts
- The HTTP request obeys the default HTTP request timeout value instead of 2 seconds
- URLs that cannot be successfully parsed due to an unexpected error receive a negative cache entry for 15 minutes
- URLs that fail with an expected error receive a negative cache entry with no TTL
- Activities that have no detected URLs insert a nil value in the Cachex :scrubber_cache so we do not repeat parsing the object content with Floki every time the activity is rendered
- Expiring image URLs are handled with an Oban job
- There is no automatic cleanup of the Rich Media data in the database, but it is safe to delete at any time
- The post draft/preview feature makes the URL processing synchronous so the rendered post preview will have an accurate rendering

Overall performance of timelines and creating new posts which contain URLs is greatly improved.
2024-05-05 13:51:13 -04:00
marcin mikołajczak 637f5bc431 Fix type in description
Signed-off-by: marcin mikołajczak <git@mkljczk.pl>
2024-04-27 20:29:23 +02:00
Haelwenn 88412daf11 Apply @lanodan's suggestion
Signed-off-by: marcin mikołajczak <git@mkljczk.pl>
2024-04-25 12:34:12 +02:00
lain 50af909c01 Merge branch 'pleroma-card-image-description' into 'develop'
Include image description in status media cards

See merge request pleroma/pleroma!4101
2024-04-19 07:39:05 +00:00
marcin mikołajczak 6f6bede900 Include image description in status media cards
Signed-off-by: marcin mikołajczak <git@mkljczk.pl>
2024-04-19 10:20:31 +04:00
lain 87b8ac3ce6 Merge branch 'receiverworker-error-handling' into 'develop'
ReceiverWorker: Make sure non-{:ok, _} is returned as {:error, …}

See merge request pleroma/pleroma!4100
2024-04-19 06:04:44 +00:00
Haelwenn 71a0373232 Merge branch 'ffmpeg-limiter' into 'develop'
Prevent Media Helper from respawning ffmpeg for bad media

See merge request pleroma/pleroma!4086
2024-04-17 05:47:54 +00:00
Haelwenn (lanodan) Monnier a299ddb10e ReceiverWorker: Make sure non-{:ok, _} is returned as {:error, …}
Otherwise an error like `{:signature, {:error, {:error, :not_found}}}` ends up considered a success.
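
The idea can be sketched as follows, using Python tuples to stand in for Elixir tuples and strings to stand in for atoms. The function name `normalize_result` is hypothetical, and leaving an existing two-element error tuple unchanged is an assumption, not something the commit message states.

```python
def normalize_result(result):
    """Return ("ok", value) unchanged; turn anything else into an error tuple."""
    is_pair = isinstance(result, tuple) and len(result) == 2
    if is_pair and result[0] == "ok":
        return result
    if is_pair and result[0] == "error":
        # Assumption: results that are already errors are passed through as-is.
        return result
    # Any other shape is wrapped so that a job runner treats it as a failure.
    return ("error", result)
```

With this in place, a value shaped like the `:signature` example above would be reported as an error rather than being mistaken for a success.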
2024-04-17 07:43:47 +02:00
tusooa d80e0d6873 Merge branch 'user-actor-webfinger' into 'develop'
FEP-2c59, add "webfinger" to user actor

See merge request pleroma/pleroma!4099
2024-04-12 03:09:37 +00:00
marcin mikołajczak 4f5c4d79c4 FEP-2c59, add "webfinger" to user actor
Signed-off-by: marcin mikołajczak <git@mkljczk.pl>
2024-04-11 17:50:11 +02:00
marcin mikołajczak ccc3ac241f Add hint to rules
Signed-off-by: marcin mikołajczak <git@mkljczk.pl>
2024-04-06 11:45:19 +02:00
marcin mikołajczak 9e6cf45906 /api/v1/accounts/familiar_followers
Signed-off-by: marcin mikołajczak <git@mkljczk.pl>
2024-04-06 11:43:56 +02:00
marcin mikołajczak 01a5f839c5 Merge remote-tracking branch 'origin/develop' into instance_rules 2024-04-06 10:42:23 +02:00
Moon Man b3c46387fa Merge remote-tracking branch 'origin/develop' into spc2 2024-03-23 11:49:56 -05:00
Moon Man b6344879d6 Merge remote-tracking branch 'origin/logger-metadata' into spc2 2024-03-23 11:49:46 -05:00
lain 987f44d811 Merge branch 'bookmark-folders' into 'develop'
Fix BookmarkFolderView, add test

See merge request pleroma/pleroma!4096
2024-03-20 13:26:47 +00:00
marcin mikołajczak 37ec645ff2 Fix BookmarkFolderView, add test
Signed-off-by: marcin mikołajczak <git@mkljczk.pl>
2024-03-20 13:24:43 +01:00
Mark Felder 462d5aa5cb logger: remove request_id metadata which is not useful 2024-03-19 20:53:40 -04:00
Mark Felder 99cee755d8 Show Logger metadata in dev 2024-03-19 12:15:10 -04:00
Mark Felder 40823462e7 Logger metadata for request path and authenticated user 2024-03-19 12:15:10 -04:00
Mark Felder 7dfd148ff8 Logger metadata for inbound federation requests 2024-03-19 12:15:10 -04:00
Mark Felder 741f22bfe0 MediaHelper: cache failed URLs for 15 minutes to prevent excessive retries 2024-03-19 12:14:03 -04:00
Mark Felder c25fda34e7 Skip generating notifications for internal users 2024-03-19 12:11:30 -04:00
Mark Felder 291d531e4c Unify notification push and streaming events for both local and federated activities
This also removes generation of notifications for blocked/filtered/muted users and threads.
2024-03-19 12:11:30 -04:00
lain f775a1931b Merge branch 'transient-validators-defaults' into 'develop'
Set default values on transient objects (attachment, poll options) validators

See merge request pleroma/pleroma!4090
2024-03-19 12:44:13 +00:00
Lain Soykaf 4e8a1b40cb Merge branch 'develop' of git.pleroma.social:pleroma/pleroma into transient-validators-defaults 2024-03-19 16:26:02 +04:00
lain 8a14fdbe47 Update transient-validators-defaults.change 2024-03-19 12:03:43 +00:00
lain 4e37cd85ef Merge branch 'fix-bookmark-test' into 'develop'
CI: Move changelog check to later in the pipeline

See merge request pleroma/pleroma!4095
2024-03-19 12:02:10 +00:00
Lain Soykaf 040a980277 Add changelog 2024-03-19 15:03:16 +04:00
Lain Soykaf afae3a94a4 CI: Move changelog check to later in the pipeline
No reason to not run tests.
2024-03-19 13:54:35 +04:00
Lain Soykaf 9617189e96 Tests: Actually run the bookmark folder tests. 2024-03-19 13:51:04 +04:00
lain 8e37f19883 Merge branch 'test-improvements' into 'develop'
Tests: Explicitly set db pool size and max cases to the same value.

See merge request pleroma/pleroma!4094
2024-03-19 07:44:05 +00:00
Lain Soykaf 665947ab2a Tests: Reduced the max case number to make tests more stable. 2024-03-19 11:03:05 +04:00
Lain Soykaf 3cc8414c2e Add changelog 2024-03-19 10:38:29 +04:00
Lain Soykaf 923803a533 Tests: Explicitly set db pool size and max cases to the same value. 2024-03-19 10:34:37 +04:00
lain ca5766c0a7 Merge branch 'postgres-bump' into 'develop'
Update minimum Postgres version to 11.0; disable JIT

See merge request pleroma/pleroma!4093
2024-03-19 04:46:40 +00:00