Styling a thread #1

Open
floriandierickx opened this issue Jun 26, 2019 · 8 comments

@floriandierickx

Dear,

Thanks a lot for your application! I am very interested in using your tool to convert a Twitter thread into a blog post for my website.

I tried to convert a thread by running the following commands:

thrd <- get_thread("1143480292830842880")
thrd
htmltools::HTML(thrd$thread_html)

This captures the thread and converts it to HTML.

But when I try to style the thread using the built-in CSS with style_tweet(thrd), it throws the error:

Error in x[1, ] : incorrect number of dimensions
Calls: <Anonymous> ... style_tweet -> %>% -> eval -> eval -> htmlify_tweet
Execution halted

Any idea how to overcome this? Maybe I missed some step in the workflow..

Best regards

@hrbrmstr
Owner

Thx for the kind words and kicking the tyres on the pkg. I'll work on a thread styler, but in the meantime this will work:

HTML("
<style>
.tweet-div {
  font-family: 'Helvetica Neue', sans-serif; 
  font-weight: 300; 
  font-size: 1em; 
  line-height: 1.3em; 
  border: 0.5px solid rgba(27, 40, 54, 0.5); 
  border-radius: 5px; 
  width: 494px; 
  display: block; 
  padding: 12px 6px 12px 6px; 
  margin-bottom: 4pt; 
}

a.tweet-lnk { 
  font-size: 0.85em; 
  line-height: 1.3em; 
  text-decoration: none; 
  color:rgb(29, 161, 242) 
}

img.tweet-img { 
  display: block; 
  max-width:100%; 
  border: 0.25px dotted black; 
  margin-top:12px; 
}

span.tweet-hashtag { color:rgb(29, 161, 242) }
div.tweet-intro { margin-bottom:6px }
span.tweet-mention { color:rgb(29, 161, 242) }
span.tweet-source { font-size: 1.1em; font-weight: 700 }
span.tweet-ts { font-size: 0.75em; line-height: 1.3em; color: #2b2b2b }
</style>
", paste0(thrd$thread_html, collapse = "\n")
) %>% 
  cat(file = "thread-style-ex.html")

(essentially supplying the 'light' CSS from the styler function and the tweet HTML together).

^^ makes thread-style-ex.html: https://rud.is/dl/thread-style-ex.html
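The same idea can be wrapped in a small reusable function. What follows is only a minimal sketch in base R: write_styled_thread() and its argument names are hypothetical (they are not part of tweetview), and it assumes thread_html is a character vector of per-tweet HTML fragments such as thrd$thread_html.

```r
# Hypothetical helper (NOT part of tweetview): combine a CSS string and a
# character vector of per-tweet HTML fragments into a single HTML file.
write_styled_thread <- function(thread_html, css, file) {
  page <- c(
    "<style>",
    css,
    "</style>",
    paste0(thread_html, collapse = "\n")
  )
  writeLines(page, file)
  invisible(file)  # return the path so the call can be chained or inspected
}

# Stand-in data; with a real thread, pass thrd$thread_html instead.
demo_css  <- ".tweet-div { font-family: sans-serif; }"
demo_html <- c(
  "<div class='tweet-div'>first</div>",
  "<div class='tweet-div'>second</div>"
)
out_file <- write_styled_thread(demo_html, demo_css, tempfile(fileext = ".html"))
```

Keeping the CSS as a plain argument makes it easy to swap the 'light' rules above for a site's own stylesheet.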


@floriandierickx
Author

Cool, thanks a lot for this fast reply! Nice to see there is an alternative to the proprietary ThreadReader. If I have some time I might also look into it and contribute (although I'm much slower than you are, I think :p ). Your script looks much better than the Twitter thread embed (which can be forced, for example, with this script). Some brainstorming for further development (note to self, or anybody having the time and energy to do it): leaving out the author in subsequent tweets to enhance readability. It would also be really cool to have an option to include a kind of 'table of contents' that helps people navigate through the thread by linking to specific parts, but maybe that's way too crazy for a thread :)
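The table-of-contents idea could be prototyped along these lines. This is a rough sketch under stated assumptions, not an existing tweetview feature: build_toc(), the tweet-N ids, and the tweet-toc class are all made up for illustration, and the section starts and labels have to be chosen by hand.

```r
# Hypothetical helper (NOT part of tweetview): give every tweet paragraph an
# id and prepend a list of links that jump to the tweets chosen as section
# starts.
build_toc <- function(texts, starts, labels) {
  stopifnot(length(starts) == length(labels))
  starts <- as.integer(starts)

  # one paragraph per tweet, addressable as #tweet-1, #tweet-2, ...
  paras <- sprintf(
    "<p id='tweet-%d' class='tweet-para'>%s</p>",
    seq_along(texts), texts
  )

  # one list item per hand-picked section start
  items <- sprintf("<li><a href='#tweet-%d'>%s</a></li>", starts, labels)
  toc <- paste0(
    "<ul class='tweet-toc'>\n",
    paste0(items, collapse = "\n"),
    "\n</ul>"
  )

  paste0(c(toc, paras), collapse = "\n\n")
}

# Stand-in data; with a real thread, pass thrd$tweet_thread$text instead.
page <- build_toc(
  texts  = c("Intro tweet", "Some detail", "Conclusion"),
  starts = c(1, 3),
  labels = c("Introduction", "Conclusion")
)
```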

@hrbrmstr
Owner

Def gd ideas. FWIW having the components means you can do pretty much anything in the interim. E.g.

library(magrittr)
library(htmltools) # provides HTML(), div() and html_print() used below

sapply(seq_along(thrd$tweet_thread$text), function(.x) {
  
  # get tweet paragraph
  out <- sprintf("<p class='tweet-para' style = 'font-family: sans-serif'>%s</p>", thrd$tweet_thread$text[[.x]])
  
  # tack on images at the end if any
  if (!is.na(thrd$tweet_thread[["ext_media_url"]][[.x]][[1]])) {
    out <- c(out, sprintf("\n\n<img class='twimg' style='max-width: 100%%' src='%s'/>\n", thrd$tweet_thread[["ext_media_url"]][[.x]]))
  }
  
  # not necessary but makes nicer plaintext reading
  out <- c(out, "\n\n")
  
  paste0(out, collapse = "")

}) %>% 
  paste0(collapse = "") %>% 
  HTML() -> hthrd

div(hthrd, class = "thrd", style = "width:600px") %>% 
  html_print()


@floriandierickx
Author

Hey @hrbrmstr, thanks a lot for your insight and help! I managed to get both of your scripts working and integrated them on my website. Much much better now :) [although I admit that the layout and css of the site is not the best in the world :p] Thanks a lot!

thrd <- get_thread("1097822801485025281")
### Def gd ideas. FWIW having the components means you can do pretty much anything in the interim. E.g. [https://github.com/hrbrmstr/tweetview/issues/1]
library(magrittr)
sapply(seq_along(thrd$tweet_thread$text), function(.x) {

  # get tweet paragraph
  out <- sprintf("%s", thrd$tweet_thread$text[[.x]])

  # tack on images at the end if any
  if (!is.na(thrd$tweet_thread[["ext_media_url"]][[.x]][[1]])) {
    out <- c(out, sprintf("\n\n<img class='twimg' style='max-width: 100%%' src='%s'/>\n", thrd$tweet_thread[["ext_media_url"]][[.x]]))
  }

  # not necessary but makes nicer plaintext reading
  out <- c(out, "\n\n")
  
  paste0(out, collapse = "")
}) %>%
  paste0(collapse = "") %>%
  htmltools::HTML() -> hthrd

### Save to html-file
txt <- paste0(hthrd, collapse = "\n")
writeLines(txt, "thread.html")

@floriandierickx
Author

floriandierickx commented Jul 1, 2019

By the way, by integrating it in a markdown jekyll site page [for example: https://github.com/floriandierickx/floriandierickx.github.io/blob/master/_posts/2019-02-18-eu-ets-en.md], it was quite straightforward to also get the table of contents integrated with the post [https://floriandierickx.github.io/blog/2019/02/18/eu-ets]

@floriandierickx
Author

A last bit: would it be possible to also convert the hyperlinks, #s and @s in the text, to preserve hyperlinks and formatting in the blog post? I tried to merge things together and hoped to get through with it, but so far I get too many errors. I thought to mix:

[1] - current working code to convert to blogpost-html

thrd <- get_thread("1097822801485025281")
### Def gd ideas. FWIW having the components means you can do pretty much anything in the interim. E.g. [https://github.com/hrbrmstr/tweetview/issues/1]
library(magrittr)
sapply(seq_along(thrd$tweet_thread$text), function(.x) {

  # get tweet paragraph
  out <- sprintf("%s", thrd$tweet_thread$text[[.x]])

  # tack on images at the end if any
  if (!is.na(thrd$tweet_thread[["ext_media_url"]][[.x]][[1]])) {
    out <- c(out, sprintf("\n\n<img class='twimg' style='max-width: 100%%' src='%s'/>\n", thrd$tweet_thread[["ext_media_url"]][[.x]]))
  }

  # not necessary but makes nicer plaintext reading
  out <- c(out, "\n\n")
  
  paste0(out, collapse = "")
}) %>%
  paste0(collapse = "") %>%
  htmltools::HTML() -> hthrd

### Save to html-file
txt <- paste0(hthrd, collapse = "\n")
writeLines(txt, "thread.html")

with [2] a part from htmlify-tweet.R that is used to wrap hashtags, users and links

htmlify_tweet <- function(x, avatar=FALSE, images=FALSE) {

  url_pattern <- "(http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\\(\\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+)"

  txt <- x$text

  hashtags <- unlist(x$hashtags)
  if (!is.na(hashtags[1])) {
    for (h in hashtags) {
      h <- sprintf("#%s", h)
      txt <- stringi::stri_replace_all_fixed(txt, h, sprintf('<span class="tweet-hashtag">%s</span>', h))
    }
  }

  ppl <- unlist(x$mentions_screen_name)
  if (!is.na(ppl[1])) {
    for (p in ppl) {
      p <- sprintf("@%s", p)
      txt <- stringi::stri_replace_all_fixed(txt, p, sprintf('<span class="tweet-mention">%s</span>', p))
    }
  }

  urls <- unlist(x$urls_t.co)
  if (!is.na(urls[1])) {
    xpu <- unlist(x$urls_expanded_url)
    for (i in 1:length(urls)) {
      txt <- stringi::stri_replace_all_fixed(txt, urls[i], xpu[i])
    }
  }

  txt <- stringi::stri_replace_all_regex(txt, url_pattern, '<a class="tweet-lnk" href="$1">$1</a>')

I tried to insert part [2] between # get tweet paragraph and # tack on images at the end if any from part [1] to get something consistent and working, but with no luck until now..
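One way the two parts might be combined is to factor the wrapping into a function that takes a single tweet's text plus that tweet's own hashtags and mentions, and to call it between # get tweet paragraph and # tack on images at the end if any. The sketch below is an assumption-laden illustration: wrap_entities() is a hypothetical name, base gsub(fixed = TRUE) stands in for the stringi calls to keep it dependency-free, and the span classes are the ones from the excerpt above.

```r
# Hypothetical helper (NOT part of tweetview): wrap the hashtags and mentions
# of ONE tweet in <span> tags. NA entries (tweets without entities) are
# skipped, so the text comes back unchanged in that case.
wrap_entities <- function(txt, hashtags = NA, mentions = NA) {
  hashtags <- unlist(hashtags)
  mentions <- unlist(mentions)

  for (h in hashtags[!is.na(hashtags)]) {
    tag <- paste0("#", h)
    txt <- gsub(tag, sprintf('<span class="tweet-hashtag">%s</span>', tag),
                txt, fixed = TRUE)
  }

  for (p in mentions[!is.na(mentions)]) {
    who <- paste0("@", p)
    txt <- gsub(who, sprintf('<span class="tweet-mention">%s</span>', who),
                txt, fixed = TRUE)
  }

  txt
}

# Stand-in data. Inside the sapply() loop the call would look like:
#   out <- wrap_entities(out,
#                        thrd$tweet_thread$hashtags[[.x]],
#                        thrd$tweet_thread$mentions_screen_name[[.x]])
demo <- wrap_entities("Hi @bob, see #rstats", hashtags = "rstats", mentions = "bob")
```

As in the original excerpt, this is plain substring replacement, so a hashtag that is a prefix of another one (for example #r and #rstats) would need extra care.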

@floriandierickx
Author

I got the code working, most of the time. In some threads a few tweets come out as 'NA', while in other threads everything comes out perfectly. Wondering where the error is..

library(tweetview)
library(tidyverse)
thrd <- get_thread("1067393840678674433")

### Development discussion: https://github.com/hrbrmstr/tweetview/issues/1

library(magrittr)

#preparatory conversions (hashtags, people mentions, urls: shortened and full)
hashtags <- unlist(thrd$tweet_thread$hashtags)
hashtags <- unique(hashtags)
ppl <- unlist(thrd$tweet_thread$mentions_screen_name)
ppl <- unique(ppl)
url_pattern <- "(http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\\(\\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+)"
urls <- unlist(thrd$tweet_thread$urls_t.co)
urls <- unique(urls)
xpu <- unlist(thrd$tweet_thread$urls_expanded_url)
xpu <- unique(xpu)

#convert thread text to markdown + html (hashtags, mentions and urls)
sapply(seq_along(thrd$tweet_thread$text), function(.x) {

  # get tweet paragraph
  out <- sprintf("%s", thrd$tweet_thread$text[[.x]])
  
  # wrap hashtags in html
  if (!is.na(thrd$tweet_thread$hashtags[.x])) {
    for (h in hashtags) {
      h <- sprintf("#%s", h)
      out <- stringi::stri_replace_all_fixed(out, h, sprintf('<span class="tweet-hashtag">%s</span>', h))
    }
  }
  
  # wrap mentions in html
  if (!is.na(thrd$tweet_thread$mentions_screen_name[.x])) {
    for (p in ppl) {
      p <- sprintf("@%s", p)
      out <- stringi::stri_replace_all_fixed(out, p, sprintf('<span class="tweet-mention">%s</span>', p))
    }
  }
  
  # replace shortened urls with full urls
  if (!is.na(thrd$tweet_thread$urls_t.co[.x])) {
    for (i in 1:length(urls)) {
      out <- stringi::stri_replace_all_fixed(out, urls[i], xpu[i])
    }
  }
  
  # wrap urls in html
  out <- stringi::stri_replace_all_regex(out, url_pattern, '<a class="tweet-lnk" href="$1" target="_blank">$1</a>')

  # tack on images at the end if any
  if (!is.na(thrd$tweet_thread[["ext_media_url"]][[.x]][[1]])) {
    out <- c(out, sprintf("\n\n<img class='twimg' style='max-width: 100%%' src='%s'/>\n", thrd$tweet_thread[["ext_media_url"]][[.x]]))
  }

  # not necessary but makes nicer plaintext reading
  out <- c(out, "\n\n")

  paste0(out, collapse = "")
}) %>%
  paste0(collapse = "") %>%
  htmltools::HTML() -> hthrd

### Save to html-file
txt <- paste0(hthrd, collapse = "\n")
writeLines(txt, "thread.html")

@floriandierickx
Author

A postscript on this: the problem seems to be in the # replace shortened urls with full urls step, so I commented that out and now it works perfectly, rolling out on https://floriandierickx.github.io/blog/

Thanks again for your tool! Really nice to be able to automate these things and create a blogpost out of threads.
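A possible explanation for the 'NA' tweets, offered as a reading of the code above rather than a confirmed diagnosis: unlist() on urls_t.co keeps the NA entries of tweets without links, and a string replacement whose pattern is NA returns NA for the whole text. Calling unique() separately on the short and the expanded URLs can also leave the two vectors misaligned. A minimal sketch that pairs the URLs per tweet and skips NAs is shown below; expand_urls() is a hypothetical helper, and base gsub(fixed = TRUE) is used instead of stringi to keep it dependency-free.

```r
# Hypothetical helper (NOT part of tweetview): replace the shortened URLs of
# ONE tweet with their expanded form. `short` and `expanded` are assumed to
# be that tweet's own, equally long vectors, so pairs stay aligned.
expand_urls <- function(txt, short, expanded) {
  short    <- unlist(short)
  expanded <- unlist(expanded)

  # skip pairs where either side is NA (tweets without links)
  keep <- !is.na(short) & !is.na(expanded)

  for (i in which(keep)) {
    txt <- gsub(short[i], expanded[i], txt, fixed = TRUE)
  }
  txt
}

# Stand-in data. Inside the sapply() loop the call would look like:
#   out <- expand_urls(out,
#                      thrd$tweet_thread$urls_t.co[[.x]],
#                      thrd$tweet_thread$urls_expanded_url[[.x]])
with_link    <- expand_urls("see https://t.co/abc", "https://t.co/abc", "https://example.org/x")
without_link <- expand_urls("no links here", NA, NA)
```

Because the replacement only uses the URLs that belong to the tweet being processed, a tweet without links can no longer be turned into NA by another tweet's entries.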
