Commit Graph

11 Commits

Author SHA1 Message Date
Mark Felder 5a5a193877 Fix broken Rich Media parsing when the image URL is a relative path 2024-05-07 19:54:56 -04:00
Mark Felder d21aa1a77c Respect the TTL returned in OpenGraph tags 2024-05-07 19:54:56 -04:00
Mark Felder ede414094f RichMedia refactor
Rich Media parsing was previously handled on-demand with a 2 second HTTP request timeout and retained only in Cachex. Every time a Pleroma instance is restarted it will have to request and parse the data for each status with a URL detected. When fetching a batch of statuses they were processed in parallel to attempt to keep the maximum latency at 2 seconds, but often resulted in a timeline appearing to hang during loading due to a URL that could not be successfully reached. URLs which had images links that expire (Amazon AWS) were parsed and inserted with a TTL to ensure the image link would not break.

Rich Media data is now cached in the database and fetched asynchronously. Cachex is used as a read-through cache. When the data becomes available we stream an update to the clients. If the result is returned quickly the experience is almost seamless. Activities were already processed for their Rich Media data during ingestion to warm the cache, so users should not normally encounter the asynchronous loading of the Rich Media data.

Implementation notes:

- The async worker is a Task with a globally unique process name to prevent duplicate processing of the same URL
- The Task will attempt to fetch the data 3 times with increasing sleep time between attempts
- The HTTP request obeys the default HTTP request timeout value instead of 2 seconds
- URLs that cannot be successfully parsed due to an unexpected error receives a negative cache entry for 15 minutes
- URLs that fail with an expected error will receive a negative cache with no TTL
- Activities that have no detected URLs insert a nil value in the Cachex :scrubber_cache so we do not repeat parsing the object content with Floki every time the activity is rendered
- Expiring image URLs are handled with an Oban job
- There is no automatic cleanup of the Rich Media data in the database, but it is safe to delete at any time
- The post draft/preview feature makes the URL processing synchronous so the rendered post preview will have an accurate rendering

Overall performance of timelines and creating new posts which contain URLs is greatly improved.
2024-05-07 19:54:56 -04:00
Haelwenn 251c455b91 Merge branch 'deps-bump' into 'develop'
Bump dependencies

See merge request pleroma/pleroma!4044
2024-01-29 17:43:00 +00:00
Mark Felder 06b8923d42 RichMedia.Parser.TTL.AwsSignedUrl: dialyzer fix
lib/pleroma/web/rich_media/parser/ttl/aws_signed_url.ex:9:callback_type_mismatch
Type mismatch for @callback ttl/2 in Pleroma.Web.RichMedia.Parser.TTL behaviour.

Expected type:
nil | integer()

Actual type:
{:error, <<_::64, _::size(8)>>} | {:ok, integer()}
2024-01-26 17:37:32 -05:00
Mark Felder 5b95abaeea Credo.Check.Readability.PredicateFunctionNames
This check was recently improved in Credo and it does make sense for readability.

The offending functions in Pleroma have been renamed and a couple missing the ? suffix have been fixed as well.
2024-01-26 16:59:58 -05:00
lain e853cfe7c3 Revert "Merge branch 'copyright-bump' into 'develop'"
This reverts merge request !3825
2023-01-02 20:38:50 +00:00
marcin mikołajczak 10886eeaa2 Bump copyright year
Signed-off-by: marcin mikołajczak <git@mkljczk.pl>
2023-01-01 12:13:06 +01:00
Sean King 17aa3644be
Copyright bump for 2022 2022-02-25 23:11:42 -07:00
Haelwenn (lanodan) Monnier c4439c630f
Bump Copyright to 2021
grep -rl '# Copyright © .* Pleroma' * | xargs sed -i 's;Copyright © .* Pleroma .*;Copyright © 2017-2021 Pleroma Authors <https://pleroma.social/>;'
2021-01-13 07:49:50 +01:00
Alexander Strizhakov 103f3dcb9e
rich media parser ttl files consistency 2020-10-13 16:38:15 +03:00