2023-12-19T13:34:00Z

Rail and advertising

Illustration of a fast-moving train
Let's talk trains and advertising! The idea is to analyse a few train advertisements from around the world. But it's mainly a pretext for showing you how recent artificial intelligence technologies can be used to improve an article. 🤖

Table of contents

  1. A Japanese advert
  2. Der Volksgeist (The people's spirit)
  3. The French touch
  4. To sum up
  5. Let's get technical
    1. Whisper
      1. Japanese
      2. German and French
    2. Seamless
    3. Real-ESRGAN
  6. Conclusion

A Japanese advert

This short 15s clip was created by the Ponoc studio. The advert was commissioned by JR West, one of Japan's major railway companies.

In Japanese animation style, the story depicts a family going on holiday in western Japan. The focus is on the young boy, who is reluctant at first but gradually blossoms with his family.

His reticence contrasts with the family's excitement. He is quiet and not very mobile, whereas the rest of the family is moving around a lot and everyone is talking at the same time. His change of heart on this trip is reflected in his smile, which goes from an intense pout to bursts of joy when he discovers the sea.

The target audience is families in general, not children in particular. Here the advert is a very clever staging of holiday memories and nostalgia. There are many references to reality, such as the Shinkansen N700 series and the recognisable Shin-Ōsaka station. In particular, we see and hear a Japanese cicada, a true symbol of summer in Japan.

In short, this is an advert that gets a message across and speaks to its audience. 👏

Speaking of advertising that speaks to its audience ... Let's take a look at an advert from Germany.

Der Volksgeist (The people's spirit)

This mini-film advert was commissioned by Deutsche Bahn [1] and produced by Pantera. It humorously depicts a young woman on a business trip who is harassed by her client.

The advert emphasises the tranquillity and comfort of travelling by train by contrasting it with the vigilance and stress inherent in driving a car.

The humour is typically German, based on the repetitive comedy of the insistent customer. The key elements of this humour are the repetition of the phrase "Frau Fischer" and the staging of the customer trying to reach Mrs Fischer in increasingly improbable ways.

In this absurd production, sound plays as important a role as image. The sound effects are very well chosen and are enhanced by classical, symphony-style background music that supports the story. In a more realistic interpretation of the story, Mrs Mueller (the customer) does not magically appear in Mrs Fischer's car. She is, in fact, a personification of Mrs Fischer's stress, a figment of her imagination. The disappearance of these hallucinations reflects the disappearance of Mrs Fischer's stress: she is serene because she has taken the train.

I'll spare you a precise description of each shot, but every element is selected and few things are left to chance.

The advertisement is aimed at people travelling on business, a fairly small group of people. There's no question here of overshadowing the car in general, the heart of German industry. 🙄 The director, Pantera, went on to make several ads for the Mercedes-Benz car brand.

I invite you to have a look at the other creations of the BWGTBLD company.

The French touch

This mini-film made for the French SNCF group isn't really an advert. The aim is not to sell you something, but to work on the brand's image and promote the SNCF's new slogan "For all of us".

The video is based around a slam text, sung by Gaël Faye, and illustrated by a number of very short video extracts from a variety of sources.

The clip draws on a wide range of French cultural references and a number of previous SNCF promotional sequences, some of them very old. These sequences, while appealing to the public's memory, are sufficiently unremarkable to make it difficult to remember exactly where they come from. The aim is to give way to the key element of this spot, the song.

Punctuated by a few piano notes [2], it describes the SNCF and its values. There are many puns and subtle figures of speech. In particular, the SNCF describes its commitments and its humanist nature.

All French people are invited to love the SNCF in this unifying video. 💕 Paradoxically, it brings people together by emphasising France's diversity, thus recalling one of the SNCF's commitments, "supporter la diversité" ("supporting diversity").

As the video progresses, the description of the SNCF becomes more and more descriptive of the French people themselves, in particular through the use of the impersonal pronoun "on", which has been ambiguous since the beginning of the video, and which can refer to both the SNCF and the French people. This amalgam appeals to the patriotic pride of French citizens.

Now that the link has been created, it's the ideal time for SNCF to broach the subject of its faults and share them with its public, since its faults are the clichéd faults of the French. This empathy not only invites us to forgive them, but also reinforces the idea of a French company for the French, for France, "for all of us".

This last idea is particularly highlighted by the final punchline “On est pas carré, on est hexagonal” (“We're not square, we're hexagonal”). In a very French play on words, it turns the weakness of not being very rigorous (“not being square”) into the quality of being French. 🇫🇷

To sum up

This study of the advertisements shows that these clips are culturally very rich. We can see that these groups do not hesitate to invest money to promote themselves, or at least that they have a lot of experience in their communications. The references are anchored in the culture of the country and I note that this contradicts the modern policies of opening up the rail market to competition, particularly from abroad. One could expect communication in English, with references based on [pop culture](https://fr.wikipedia.org/wiki/Culture_populaire).

To finish on a funny note, here's a set of accident prevention videos from the Latvian company LDz.

Set of LDz videos featuring the Avārijas Brigāde characters, about rail safety.

Let's get technical

The video subtitles were created using LLM-based tools. The LDz video has been upscaled and is much more pleasant to watch than the original version.

To create this article, I used yt-dlp, ffmpeg, Whisper, Seamless and Real-ESRGAN.

I will not cover yt-dlp and ffmpeg here, and will concentrate on the machine learning tools.

Whisper

Since the videos aren't necessarily in a language that readers of this blog speak, I wanted subtitles for them.

I used Whisper to transcribe every video. I already had the subtitles for the SNCF ad, so I used them to check the quality of Whisper's output, since French is my mother tongue.

I set up a container and installed Whisper. The quality of the transcriptions varies widely depending on the language.
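As an illustration, here is a minimal sketch of how a transcription can be turned into an SRT file from Python. It assumes the `openai-whisper` package and its `load_model` / `transcribe` calls; the function names `srt_timestamp`, `segments_to_srt` and `transcribe_to_srt` are made up for this example, and nothing here claims to be the exact commands I ran.

```python
def srt_timestamp(seconds):
    """Format a duration in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments):
    """Turn Whisper-style segments (dicts with start, end, text) into SRT text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    return "\n".join(blocks)


def transcribe_to_srt(audio_path, language, model_name="medium"):
    """Transcribe one file with openai-whisper (assumed installed) and return SRT text."""
    import whisper  # imported lazily: loading a model is slow and memory-hungry

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path, language=language)
    return segments_to_srt(result["segments"])
```

Calling `transcribe_to_srt("advert.wav", "de")` (a hypothetical file name) would return the subtitles as a string, ready to be written to disk. Keeping the SRT formatting separate from the model call makes it easy to compare several Whisper variants on the same audio.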

Japanese

In Japanese, I can hardly evaluate the result.
However, the transcription seems hard to follow: the sentences are poorly linked and the general meaning is not clear.

I also used insanely-fast-whisper to transcribe Japanese.

I have to admit that the results are better, and obtained much faster, than with OpenAI's version, even if they are not on par with the results for Latin languages.

German and French

For German, the transcription is OK [3]. However, the video contains cut-off sentences that are poorly transcribed, as are two short sentences at the end. With the medium-sized model, the results are better.

In French, there is only one transcription error [3], but the spelling sometimes leaves something to be desired; the lyricism of the song certainly has something to do with it.

I've also tried whispercpp on the French and German audio. The results are very good, more or less identical to those of the OpenAI version, though slightly worse overall.
I even tried a fine-tuned model for French. It corrects the transcription error that Whisper made, but creates another one!
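Counting errors by eye works for a short advert, but the comparison can also be put into numbers. The sketch below computes a word error rate (WER), the usual metric on transcription leaderboards, between a reference transcript and a Whisper output. It is only an illustration: I compared the transcripts by reading them, and the function name `word_error_rate` is made up for this example.

```python
def word_error_rate(reference, hypothesis):
    """Word error rate: word-level edit distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("the reference transcript is empty")

    # previous[j] = edit distance between the reference words seen so far and hyp[:j]
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


# Made-up sentence: one wrong word out of four
print(word_error_rate("Apollinaire est un poète", "Napoléon est un poète"))
```

This naive version splits on whitespace and ignores punctuation handling, so it is harsher than it should be on a sung text where the line breaks and spelling are debatable.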

So, after a few tweaks, I have the original subtitles for the videos! 🥳
Now I just need to translate them. 🎉

As it happens, whisper and whispercpp offer to translate directly into English (English only). However, I would also like to have the French version.

Why didn't you use other transcription methods?

That's an interesting question. There are many different methods of transcription. There is even a leaderboard of these methods on paperswithcode.com.

It is difficult to know how usable these methods are in a real context, with noise and background sound. Also, some of these methods are very resource-intensive. I tried tevr-asr-tool for example, but it was very resource-hungry (⚠️) and not suited to a noisy environment.

Seamless

The idea here is to have the Whisper subtitles in 3 languages: the original language, French and English.

In the first version of this article, I used NLLB, a machine learning model created by Meta and dedicated to text translation.

Translation was fast, but the results were disappointing.
Also, using the model itself was, in my opinion, too complex.

For this version, I switched to Seamless. There is no CLI tool available with Seamless, so you have to write Python code. Fortunately, Hugging Face has everything you need.

import sys
from pathlib import Path

import srt
from transformers import AutoProcessor, SeamlessM4TForTextToText


def translate(text, processor, model):
    # Tokenise the source text, generate the translation, then decode it back to text.
    # langSrc and langTrg are the module-level settings defined below.
    text_inputs = processor(text=text, src_lang=langSrc, return_tensors="pt")
    output_tokens = model.generate(**text_inputs, tgt_lang=langTrg)

    return processor.decode(output_tokens[0].tolist(), skip_special_tokens=True)


fileToTranslate = "JR-West-ads-Summer_Train-ja.srt"
langSrc, langTrg = "jpn", "fra"

# Must clone the Hugging Face repo with git lfs https://huggingface.co/facebook/hf-seamless-m4t-large
print("Loading processor")
processor = AutoProcessor.from_pretrained("hf-seamless-m4t-large")
print("Loading model")
model = SeamlessM4TForTextToText.from_pretrained("hf-seamless-m4t-large")

# Parse the subtitles produced by Whisper
with open(fileToTranslate, encoding="utf-8") as f:
    subs = list(srt.parse(f.read()))

if not subs:
    print("No subs available")
    sys.exit(0)

# Translate each subtitle in place, keeping its index and timings
for sub in subs:
    print(f"> {sub.content}")
    sub.content = translate(sub.content, processor, model)
    print(f">> {sub.content}")

# Write the result in the current directory as <original name>_<target language>.srt
with open(f"{Path(fileToTranslate).stem}_{langTrg}.srt", "w", encoding="utf-8") as f:
    f.write(srt.compose(subs))

The overall result is stunning! 🤯

Once again, for Japanese it's very difficult to assess the relevance of the translation, especially when it is based on a text that is most likely badly transcribed. On the other hand, for German and French, it's flawless! 💯
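Since Seamless comes without a command-line tool, the script above could be wrapped in a small one to avoid editing the file name and languages by hand for each video. This is only a sketch of the argument handling, using the standard `argparse` module; `build_parser`, `output_path` and `main` are names invented for this example, and the translation itself is left as a comment.

```python
import argparse
from pathlib import Path


def build_parser():
    parser = argparse.ArgumentParser(description="Translate an SRT file (hypothetical wrapper)")
    parser.add_argument("srt_file", type=Path, help="subtitle file to translate")
    parser.add_argument("--src", required=True, help="source language code, e.g. jpn")
    parser.add_argument("--tgt", required=True, nargs="+", help="target language codes, e.g. fra eng")
    return parser


def output_path(srt_file, lang_trg):
    # Same naming scheme as the script above: <stem>_<target language>.srt
    return Path(f"{srt_file.stem}_{lang_trg}.srt")


def main(argv=None):
    args = build_parser().parse_args(argv)
    for lang in args.tgt:
        print(f"{args.srt_file} ({args.src}) -> {output_path(args.srt_file, lang)}")
        # Here: parse the subtitles, call translate() on each one, write the result


# Example: main(["JR-West-ads-Summer_Train-ja.srt", "--src", "jpn", "--tgt", "fra", "eng"])
```

Accepting several target languages at once means the model is loaded only once per video, which matters because loading it is by far the slowest step.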

Real-ESRGAN

Here, it's going to be very simple: I followed the instructions in the README. First, I made sure I had PyTorch installed and that Python was at least version 3.7. Then I downloaded the recommended pre-trained model, RealESRGAN_x4plus.pth. I'm actually rather surprised not to find more variants available on Hugging Face.

Then I enlarged each frame of the original video with this little script:

# Upscale the extracted frames one by one and print the progress in percent
total=14706
for i in $(seq 1 "$total"); do
        file=$(printf "png/%04d.png" "$i")
        python3 inference_realesrgan.py -n RealESRGAN_x4plus -i "$file" -o train --fp32
        echo "$file $((i * 100 / total))"
done

And then I encoded the video in AV1 with ffmpeg.
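For reference, here is a sketch of what that encoding step can look like when driven from Python. Several things are assumptions rather than what I actually ran: the frame rate of 25, the `libaom-av1` encoder with a CRF of 30, the output name, and the frame pattern `train/%04d_out.png`, which relies on Real-ESRGAN appending `_out` to the file names (its default, as far as I know). The function names are made up for this example.

```python
import subprocess


def av1_encode_command(frames_pattern, output, fps=25, crf=30):
    """Build the ffmpeg command line that turns numbered PNG frames into an AV1 video."""
    return [
        "ffmpeg",
        "-framerate", str(fps),   # assumed value: use the frame rate of the source video
        "-i", frames_pattern,     # e.g. train/%04d_out.png
        "-c:v", "libaom-av1",     # AV1 encoder shipped with many ffmpeg builds
        "-crf", str(crf),         # constant quality, lower means better quality
        "-b:v", "0",              # required by libaom-av1 for pure constant-quality mode
        "-pix_fmt", "yuv420p",
        output,
    ]


def encode(frames_pattern, output):
    # Raises CalledProcessError if ffmpeg exits with an error
    subprocess.run(av1_encode_command(frames_pattern, output), check=True)


# Example: encode("train/%04d_out.png", "ldz-upscaled.mkv")
```

This only rebuilds the picture: the audio track of the original video still has to be copied over in a separate step.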

It seems that there is a model for video enhancement. However, I only found out about it after I'd started enhancing based on individual images.

I invite you to try out the video enhancement model for yourself and let me know by e-mail how good it is ;)

Conclusion

I'll be brief: LLM-based technologies perform well and are going to improve rapidly. 👌
They are now tools that can be used on a daily basis to produce quality documents. They are now available locally and as free (libre) software.

This last point is important, as we can see that these technologies revolve around the United States and the English language. The existence of open-source software in this field guarantees that other countries will be able to acquire these technologies without depending on large companies such as Meta, Microsoft or Google (Big Tech).

The American hegemony over AI tools is rivalled only by the BATX (Baidu, Alibaba, Tencent and Xiaomi). In such a context, it is not surprising that these tools are US-centric. Not only is English the default language of these tools (sometimes the only language available!), but these models also perform better in English than in the other European languages. This is obviously a problem of fairness between peoples, and in practice a competitive advantage for US companies.


To conclude, I'm going to describe how in 5 minutes, I animated the illustration for this article.

I already had a rough idea of what I wanted the animation to look like: a moving train, the idea of speed, etc. But I didn't really know what I wanted. Often in my animations, what limits me is the base SVG illustration [4], so I can spend a lot of time finding the right SVG to animate.

Here, on the contrary, it was very quick. On iconbuddy, a free vector icon site, I searched for 'train' and came across this illustration created by IBM.

Bullet train illustrated on a railway

Using Inkscape, I split the image into several components to animate independently. My aim was to quickly check that the infinite rail effect I wanted was possible. To do this, I enlarged the rails and added an animation using an animateTransform element.

  <animateTransform
    attributeName="transform"
    attributeType="XML"
    type="translate"
    from="0"
    to="-12"
    dur="0.15s"
    repeatCount="indefinite" />

Train animation with only the rails moving

It's perfect! I then animated the train. This time, for aesthetic reasons, I wanted a more complex animation where the train moves forward, slows down and then speeds up. So I turned to CSS keyframe animations with an 'ease-in-out' animation. You can add CSS directly to the SVG, which is very practical.

#path1 {
  animation: train 4s infinite;
  animation-timing-function: ease-in-out;
}

@keyframes train {
  0% {
    transform: translate(-40px);
  }
  30% {
    transform: translate(-5px);
  }
  53% {
    transform: translate(-15px);
  }
  100% {
    transform: translate(220px);
  }
}

Very quickly I had a convincing result, which I showed to my girlfriend.

She retorted in a slightly mocking tone: “It's not bad! Your train's a bit short though.”

A train that is too short

I quickly enlarged it, added two small strips to reflect the speed and that was it!

The article's logo: A train moving fast on rails, simplified style, side view.


  1. Which could be translated as “German rail network”.

  2. Although the music is simple, it is not to be outdone. The rhythm is modelled on and introduced by railway noise. Every snappy moment is accentuated by the SNCF's signature jingle, which lends itself surprisingly well to this!

  3. Make sure you are using the latest version. Otherwise, support for French and German is mediocre. In fact, before version 3, "Apollinaire" systematically became "Napoléon", which is a lot less romantic.

  4. Of course, my lack of talent for drawing this type of illustration myself is also a problem.
