Commit Graph

117 Commits

Author SHA1 Message Date
Alexander Strizhakov 514c899275
adding gun adapter 2020-02-18 08:19:01 +03:00
rinpatch 472132215e Use floki's new APIs for parsing fragments 2020-02-16 01:55:26 +03:00
feld 237b2068f9 Revert "Merge branch 'feat/floki-fasthtml' into 'develop'"
This reverts merge request !2194
2020-02-11 16:55:18 +00:00
rinpatch ea1631d7e6 Make Floki use fast_html 2020-02-11 16:17:21 +03:00
Alexander Strizhakov 1f4fbe9d98
title parse improvement 2020-01-29 11:13:34 +03:00
Alexander Strizhakov 7bd4c14581
meta tag parser respect first title header 2020-01-28 19:29:27 +03:00
Maksim Pechnikov b4cf74c106 added prepare html for RichMedia.Parser 2019-09-15 14:53:58 +03:00
Maksim 139b196bc0 [#1150] fixed parser TwitterCard 2019-08-06 20:19:28 +00:00
Ariadne Conill b93498eb52 constants: add as_public constant and use it everywhere 2019-07-29 02:43:19 +00:00
Ariadne Conill d3bdb8e704 rich media: parser: splice the given URL into the result 2019-07-23 23:51:29 +00:00
rinpatch 3368174785 Fix rich media parser failing when no TTL can be found by image TTL
setters
2019-07-21 18:22:22 +03:00
Sachin Joshi de9906ad56 change the structure of image ttl parsar 2019-07-19 11:43:42 +05:45
Sachin Joshi 18234cc44e add the rich media ttl based on image exp time 2019-07-17 00:20:34 +05:45
Alex S f4447d82b8 parsers configurable 2019-07-14 09:21:56 +03:00
feld 93a0eeab16 Add license/copyright to all project files 2019-07-10 05:13:23 +00:00
Maksim Pechnikov 5c0f646cef fix validate_page_url 2019-06-26 06:27:17 +03:00
Maksim Pechnikov 4ad15ad2a9 add ignore hosts and TLDs for rich_media 2019-06-25 22:25:37 +03:00
Maksim Pechnikov 0276cf5a02 fix validate_url for private ip 2019-06-25 17:44:24 +03:00
lain 0e415921cd Rich Media Parser: Do not return just a title if nothing else is there. 2019-06-22 16:22:59 +02:00
lain 58c4d5312b Revert "Revert "Merge branch 'fix/ogp-title' into 'develop'""
This reverts commit b6af80f769.
2019-06-22 15:12:57 +02:00
feld b6af80f769 Revert "Merge branch 'fix/ogp-title' into 'develop'"
This reverts merge request !1277
2019-06-21 11:36:32 +00:00
rinpatch f30a3241d2 Deps: Update auto_linker 2019-06-18 16:08:18 +03:00
Egor Kislitsyn a12f8e13c8 Improve <title> fallback; Add a test 2019-06-13 15:02:46 +07:00
Mark Felder 7363a0ea8a Revert "Only run Floki if title is missing from the map"
This reverts commit 97d2b1a45a.
2019-06-12 18:32:28 -05:00
Mark Felder 97d2b1a45a Only run Floki if title is missing from the map 2019-06-12 18:27:35 -05:00
Mark Felder 097fdf6a5d Attempt to use <title> from HTML as a fallback 2019-06-12 17:56:51 -05:00
Egor Kislitsyn bf22ed5fbd Update `auto_linker` dependency 2019-06-12 15:53:33 +07:00
rinpatch 92213fb87c Replace Mix.env with Pleroma.Config.get(:env)
Mix.env/0 is not availible in release environments such as distillery or
elixir's built-in releases.
2019-06-06 23:59:51 +03:00
Sergey Suprunenko 1690be991e Replace missing non-nullable Card attributes with empty strings 2019-05-30 21:03:31 +00:00
William Pitcock 0da1233e8e rich media: suppress link previews if post is marked as sensitive 2019-05-17 18:49:43 +00:00
William Pitcock 57d11ac9db activitypub: move post rich media fetching to job queue 2019-05-13 19:36:00 +00:00
Roman Chvanikov 4615e56219 Add `with_body: true` to requests relying on `max_body: val` 2019-04-12 00:16:33 +07:00
William Pitcock c62220c500 rich media: helpers: only crawl Create activities 2019-03-23 02:28:59 +00:00
William Pitcock b3bf523c09 rich media: use optimized Object.normalize() 2019-03-23 00:22:57 +00:00
Haelwenn (lanodan) Monnier a3a9cec483
[Credo] fix Credo.Check.Readability.AliasOrder 2019-03-13 04:26:54 +01:00
William Pitcock 19afd9f81f http: rework connection timeouts to match hackney docs, enforce 1 second max TCP connection timeout 2019-03-08 22:56:16 +00:00
William Pitcock b7aa1ea9e6 rich media: helpers: rework validate_page_url() 2019-03-04 18:39:13 +00:00
William Pitcock 9f3cb38012 helpers: use AutoLinker to validate URIs as well as the other tests 2019-03-04 18:31:49 +00:00
William Pitcock d38d537bee rich media: don't crawl bogus URIs 2019-03-04 18:31:49 +00:00
William Pitcock 45e57dd187 rich media: tighten fetching timeouts and size limits 2019-02-10 21:54:08 +00:00
Haelwenn (lanodan) Monnier 6a6a5b3251
de-group alias/es 2019-02-09 16:31:17 +01:00
William Pitcock d83dbd9070 rich media: parser: reject any data which cannot be explicitly encoded into JSON 2019-02-05 20:50:57 +00:00
lain b19b4f8537 Remove default value for rich media.
Setting it to true will actually override a 'false' set before.
2019-01-31 20:02:08 +01:00
lambda 44913c1019 Merge branch 'bugfix/rich-media-non-unicode' into 'develop'
rich media non-unicode bugfix

See merge request pleroma/pleroma!749
2019-01-31 16:54:48 +00:00
William Pitcock 46dba03098 rich media: parser: only try to validate strings, not numbers (OEmbed) 2019-01-31 16:19:31 +00:00
William Pitcock dafb6f0b5e rich media: parser: reject OGP fields we cannot safely process 2019-01-31 16:03:56 +00:00
rinpatch 7057891db6 Make rich media support toggleable 2019-01-31 18:18:20 +03:00
href 5ea0397e2d
Fix 4aff4efa typos 2019-01-30 21:08:41 +01:00
href 4aff4efa8d
Use multiple hackney pools
* federation (ap, salmon)
* media (rich media, media proxy)
* upload (uploader proxy)

Each "part" will stop fighting others ones -- a huge federation outbound
could before make the media proxy fail to checkout a connection in time.

splitted media and uploaded media for the good reason than an upload
pool will have all connections to the same host (the uploader upstream).
it also has a longer default retention period for connections.
2019-01-30 15:06:46 +01:00
William Pitcock 61d6715714 rich media: oembed: return data in the same format as the other parsers 2019-01-28 21:13:25 +00:00
William Pitcock ddb5545202 rich media: kill some testsuite noise 2019-01-28 20:55:33 +00:00
William Pitcock 0f11254a06 rich media: parser: add some basic sanity checks on the returned data with pattern matching 2019-01-28 20:43:21 +00:00
William Pitcock 83b7062634 rich media: parser: cache negatives 2019-01-28 20:19:07 +00:00
William Pitcock 8fb16e9f0f rich media: parser: add copyright header 2019-01-28 20:00:01 +00:00
William Pitcock ebeabdcc72 rich media: helpers: clean up unused aliases 2019-01-28 06:10:25 +00:00
William Pitcock 8e42251e06 rich media: add helpers module, use instead of MastodonAPI module 2019-01-28 06:04:54 +00:00
William Pitcock 6096846f5f API: kill /api/rich_media/parse endpoint 2019-01-28 05:53:17 +00:00
William Pitcock de42646634 rich media: add try/rescue to ensure we catch parsing and fetching failures 2019-01-28 05:53:17 +00:00
William Pitcock 8f2f471e94 rich media: gracefully handle fetching nil URIs 2019-01-26 16:36:17 +00:00
Maxim Filippov b8a77c5d70 Add OEmbed parser 2019-01-13 02:06:50 +02:00
Maxim Filippov 1f851a0723 Add Twitter Card parser 2019-01-10 18:09:56 +00:00
rinpatch a2d7f0e0e9 Remove :commit since a tuple is already returned 2019-01-09 21:35:01 +03:00
William Pitcock 487c00d36d rich media: disable cachex in test mode 2019-01-04 23:53:26 +00:00
William Pitcock 0964c207eb rich media: use cachex to avoid flooding remote servers 2019-01-04 23:32:01 +00:00
Maxim Filippov 48e81d3d40 Add RichMediaController and tests 2019-01-02 17:02:50 +03:00
Maxim Filippov 917d48d09b Better variable name 2019-01-01 23:29:47 +03:00
Maxim Filippov 2aab4e03c3 Add OGP parser 2019-01-01 23:26:40 +03:00