
Article 13 of the Copyright Directive considered harmful

[this is a translation+partial update of my original post in French here]

The “Directive on Copyright in the Digital Single Market”, in short the “Copyright Directive”, having passed the JURI committee vote with amendments on 20 June 2018, will soon be voted on in a plenary session of the European Parliament, on 5 July 2018.

I wrote the following text before calling some Members of the European Parliament (MEPs), thus participating in the campaign started by saveyourinternet.eu.

I would like to invite you to do the same, but not before you have read some of the references quoted at the end of this page and consulted https://juliareda.eu/2018/06/article-11-13-vote/

Two articles of the directive are especially dangerous.

  • Article 11, about referencing and quoting press articles; we will not develop this issue any further here.
  • Article 13, about so-called “upload filters” on all content sharing sites (i.e. all sites with a content-sharing function, including comments/videos/photographs/audio on social networks).

The stated goal of Article 13 is to protect rightholders of the entertainment industry against the hegemony of the big web sharing platforms, most notably Youtube, which allegedly results in revenue “evasion” when rightholders’ content is illegally uploaded to and consulted on these platforms.

The proposed solution is to create a legal obligation to deploy systems blacklisting protected content on all content sharing sites, for all types of content, even those that don’t need protection (for example, computer software source code).

We are going to examine how such systems work, why they are costly to implement and cause significant collateral damage, and why the targeted platforms already implement measures satisfying the stated goal.

Content blacklist systems

They can be roughly classified into three categories:

“Exact match” detection

They are relatively cheap in terms of resources. They work on raw digital data. They don’t need to be aware of formats or media types, nor even of the detailed original content to protect, thanks to the use of so-called “hashing” or “digest” algorithms.

These features make such systems very easy to implement and operate, and very cheap. Implementations of the algorithms are available as free and open source software, or in the public domain (for the underlying mechanism), and are easily adapted to any platform.

On the other hand, these systems are very easy to bypass, through minor changes in the protected file. In consequence, they constitute a very poor protection for rightholders.
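To make this concrete, here is a minimal sketch of exact-match blacklisting with standard tools (file names hypothetical): the platform keeps a list of digests of protected files and compares the digest of every upload against it.

# compute the digest of the uploaded file (SHA-256 here)
sha256sum suspect-upload.mp4
# reject the upload if its digest appears in the blacklist of digests
grep -q "$(sha256sum suspect-upload.mp4 | cut -d' ' -f1)" blacklist.sha256 && echo blocked

Changing a single byte of the file yields a completely different digest, which is exactly why this kind of protection is so easily bypassed.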

Detection “by similarity”

These systems are much more sophisticated. They have a knowledge of media formats, and are able to extract characteristic elements, similar to a fingerprint of the protected content.

This process enables much wider detection of the content, even heavily modified, for example a barely audible background sound in a family video or amateur show.
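As an illustration (not necessarily what any given platform runs), the open source Chromaprint project’s fpcalc tool computes this kind of perceptual fingerprint for audio; a re-encoded or slightly edited copy yields a similar fingerprint, so matching is done by distance rather than by equality (file name hypothetical):

# compute a compact perceptual fingerprint of the first 120 seconds of audio
fpcalc -length 120 song.mp3
# prints DURATION=... and FINGERPRINT=... (a long compressed feature vector)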

The most famous system in this category is Content-Id, implemented by Youtube, described here by Google. A lot of comments on Article 13 refer to Content-Id as a model. Article 13 itself seems to have been written with Content-Id in mind.

Systems “by similarity” are very expensive to develop and implement. According to the Google video quoted above, Content-Id required an investment of over $100 million.

There is also no free and open source implementation of such systems, which makes them even more difficult to deploy: you need to develop a custom, in-house system, or acquire a license for an existing commercial system, if you can find one. Few companies are in a position to provide such specific services.

Furthermore, the detection performance (false positive and false negative rates) of these systems is difficult to estimate: first, for the above-mentioned reasons (proprietary systems with limited access); second, because the underlying technical processes are based on heuristics, which prevents them from being fully reliable.

Finally, these systems present an important drawback: as explained by Google in the Content-Id presentation video, rightholders must provide the original content, or protected excerpts from it, which is difficult to achieve on a wide scale (many works, and many actors in both roles, rightholders and content sharing platforms).

“Watermarking” systems

These systems are mentioned in the annex of the directive and are only presented here for the sake of completeness. Their costs are comparable to those of similarity detection systems, but they are of limited scope, and probably not reasonably usable in the context of Article 13.

Blacklist management

Blacklist management, independently of the above technical criteria, constitutes an issue in itself.

Article 13 does not really provide satisfactory solutions to the following issues:

  • false positives (over-blocking): blocking legitimate content.
    • erroneous blacklisting by an alleged rightholder
    • erroneous blocking of content protected by an exception (parody, memes, etc), in which the blacklisting systems have identified protected content.
    • erroneous insertions in the blacklist for other reasons. This has happened repeatedly, for example, in the French police DNS blocking systems, including via misconfigured test systems. See [FR] Google.fr bloqué pour apologie du terrorisme suite à une « erreur humaine » d’Orange.
  • false negatives (under-blocking): not blocking illegitimate uses of rightholder content. Content protection is difficult to implement, even on the rightholder side: many works have not even been digitized by their legitimate rightholders.
  • adding new content to the blacklist may require manual, hence heavy, checks to reduce false positives, and even these do not guarantee their elimination.
  • unwieldy and unreliable complaint mechanisms: all over-blocking and under-blocking issues have to be handled via human, or even judicial, intervention. Yet there are daily reports of abusive content removal here and there. For example, under the United States DMCA (Digital Millennium Copyright Act), some rightholders have been known to request content removal for works they didn’t own, based on mere title similarity, or to invoke DMCA procedures to force the removal of price lists from price comparators.
  • individuals and small companies are defenceless against abusive blocking of their content if the site-internal reporting mechanism fails to address the issue in time. In most cases, action in court, or even through an alternative dispute resolution system (Article 13a), will be too expensive and too slow, resulting in a posteriori self-censorship.

Article 13, in its final wording, does not satisfactorily address these concerns, the last point above being the most worrisome.

The Content-Id system

Although Content-Id is owned by Google and specific to Youtube, it deserves a more thorough examination, as it seems to have been an implicit model for Article 13.

Content-Id is a “detection by similarity” system. To use it, rightholders have to provide Youtube with the videos they wish to protect, or samples of them.

When protected content is identified in a posted video, three options are available:

  • block the video
  • monetize the video (advertisement)
  • obtain traffic data, for example to know in which countries the video is popular.

According to Google, Content-Id has already enabled the payment of several billion dollars to rightholders, and the system covers hundreds of millions of videos.

Impact assessment of the directive

The summary of the impact assessment, as annexed to the project, is very incomplete: compared to the full impact assessment study, it mentions the impact on rightholders only in part, limiting itself to a legal discussion of the digital single market. Nor does it mention the efficiency and technical feasibility of Article 13, or its consequences for Internet sites and the Internet ecosystem. Readers are advised to refer to the full impact assessment study.

1. Disappearance or marginalization of contributive sites

Contributive sites based on free (Creative Commons, etc) content will not have the resources to operate, let alone develop, or even rent/subscribe to, systems similar to Content-Id.

The impact assessment study provides a real example of the cost of subscribing to such a service: €900/month for a small site (5000 transactions/month, i.e. about €0.18 per transaction; a transaction being a single check, executed for every post by a user).

The study only considers commercial sites where sharing is the main purpose. This fails to recognize the impact on high-volume contributive sites, social networks, amateur or family photo sharing sites, classified advertising, etc, for which there is no significant revenue stream compared to the cost of monitoring posted content.

Most notably, social networks are targeted, as Article 2/4b of the directive excludes only four very specific types of services from the requirements of Article 13:

  • services acting in a non-commercial purpose capacity such as online encyclopaedia
  • providers of cloud services for individual use which do not provide direct access to the public
  • open source software developing platforms
  • online market places whose main activity is the online retail of physical goods

As a consequence, this first impact on freedom of speech seems underestimated.

2. All content types are targeted

Most content protection systems currently in operation focus on content from the entertainment industry:

  • videos and movies
  • music

On the other hand, Internet sharing applies to many other types of content, for example photographs.

Again, the burden on Internet sites will be significant, with the same risks of abusive blocking, which also amplifies the other issues listed above.

3. Issues with respect to Freedom of Speech

As explained above and confirmed by many non-profit organizations, similarity detection systems are unable to differentiate illegal use from legal uses such as quotes, memes, parodies, etc.

It also happens frequently that works initially free of use are erroneously blacklisted, for example after being presented or quoted in protected TV shows or TV news.

In any case, where they are implemented, content detection systems already result in abusive censorship. Forcing their generalization through the Directive can only be severely harmful to Freedom of Speech, especially on social networks, by making it more difficult to exercise the above-mentioned legal exceptions.

Finally, as explained, widening content detection systems to all types of content can only make this risk more acute.

4. The proposed legal provisions are inefficient at protecting rightholders

As explained, similarity systems like Content-Id are not usable at global scale because of their cost, and exact-match systems are easy to bypass.

Furthermore, similarity systems are already deployed on the major sites, as the impact assessment study explains:

“In all, as content recognition technologies are already applied by the major user uploaded content services, it is likely that this option would not lead to significant increases in unjustified cases of prevented uploads compared to the current situation.”

In other words, Article 13 is not needed, since the goals it seeks to achieve are already implemented where it matters.

5. The proposed provisions may be harmful to cultural diversity

The impact assessment study estimates that Article 13 will promote cultural diversity, which is assumed to be a natural byproduct of rightholder protection.

But Article 13 hampers contributive and/or non-profit sites, which without a doubt are also part of cultural diversity. Most of their content is free of rights, and hence naturally enjoys maximal visibility and dissemination.

This is evidenced by Wikipedia’s statistics: it is the 5th most visited site in the world, according to Alexa. Furthermore, according to Wikimédia France: “platforms will opt for a precautionary principle, blocking more content than necessary, which will reduce the diversity of these platforms by preventing people less accustomed to new technologies from participating” (translated from « les plateformes opteront pour un principe de précaution en bloquant plus de contenu que nécessaire ce qui réduira la diversité de ces plateformes en empêchant les personnes peu aguerries aux nouvelles technologies d’y participer » here)

In summary, Article 13:

  • would not improve the rightholders’ situation with respect to the big platforms, since these have already deployed content detection and revenue sharing systems;
  • would not improve, either, the rightholders’ situation with respect to non-profit or low-traffic platforms, which don’t have the ability to operate complex detection systems, infringe protected works only accidentally and thus in a limited way, and are already in a position to remove illegal content;
  • represents, on the other hand, the following risks:
    • arbitrary censorship
    • reinforcement of the hegemony of big platforms by introducing significant barriers to entry
    • disappearance or marginalization of non-profit platforms, or their fallback to static content, removing the content sharing angle which is a key characteristic of the Internet;
  • represents, as well, serious risks regarding Freedom of Speech and Cultural Diversity.

For the above reasons, and as expressed by numerous organizations and renowned experts, it seems likely that Article 13, if kept in the directive, will do more harm than good on the European Internet.

A few references

The Open Letter on EP Plenary Vote, of which (as eriomem.net CEO) I am a signatory:

http://copybuzz.com/wp-content/uploads/2018/07/Copyright-Open-Letter-on-EP-Plenary-Vote-on-Negotiation-Mandate.pdf

Two articles (amongst many others) on Julia Reda’s blog:

Open letter by 70 Internet experts https://www.eff.org/files/2018/06/12/article13letter.pdf

Positions of the EFF (Electronic Frontier Foundation) https://www.eff.org/deeplinks/2018/06/internet-luminaries-ring-alarm-eu-copyright-filtering-proposal

https://www.eff.org/deeplinks/2018/06/eus-copyright-proposal-extremely-bad-news-everyone-even-especially-wikipedia

Other sites campaigning against Article 13:

https://www.liberties.eu/en/news/delete-article-thirteen-open-letter/13194

https://saveyourinternet.eu/

Statement by the Wikimedia Foundation:

https://blog.wikimedia.org/2018/06/14/dont-force-platforms-to-replace-communities-with-algorithms/


Bad idea: Gmail now discriminates against mail servers without an IPv6 reverse

This new gem is from the SMTP Gmail FAQ at https://support.google.com/mail/answer/81126?hl=en

(Fun note: they call it the “Bulk Senders Guidelines”… hence apparently anyone running their own personal mail server falls in that category…)

“Additional guidelines for IPv6

 

  • The sending IP must have a PTR record (i.e., a reverse DNS of the sending IP) and it should match the IP obtained via the forward DNS resolution of the hostname specified in the PTR record. Otherwise, mail will be marked as spam or possibly rejected.
  • The sending domain should pass either SPF check or DKIM check. Otherwise, mail might be marked as spam.”
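Checking whether a sending address satisfies this forward-confirmed reverse DNS requirement is easy from a shell (address and host name below are hypothetical):

# 1. PTR lookup on the sending IPv6 address
dig +short -x 2001:db8::25
# -> mail.example.org.
# 2. forward-resolve that name; the answers must include the same address
dig +short AAAA mail.example.org
# -> 2001:db8::25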

I happen to be running my own mail server, I happen to also be IPv6-connected, and finally I happen to lack a reverse DNS delegation for IPv6, because my ISP (Free) hasn’t yet bothered to provide me with one.

I’m luckier than most, as my mail is sent through the eu.org server, which does get its mail accepted by Gmail… but tagged as “spam”.

I’m not the only one in France. OVH is reported as having the same problem.

So what are my points?

  • obviously, my ISP should provide me with a correctly delegated IPv6 reverse at some point; of course, the sooner the better.
  • but, as has been determined for over 15 years now with IPv4, refusing mail based on a lacking reverse delegation is counter-productive… since spammers statistically tend to send spam from hosts with a reverse more often than legitimate users!
  • so measures like the above end up bothering legitimate users more than spammers.

So I hope Google will step back on this one, whether or not the reverse problem gets fixed.

 

 

 


IPv6 ICMP “packet too big” filtering considered harmful

If you intend to seriously run Internet servers or firewalls in the future (hence, IPv6 servers and firewalls), please read this.

This problem is so well known, so old, and yet still so unfixed and pervasive that, after pulling my hair for days over many hanging or timing-out IPv6 sessions, I felt I had to write this.

Executive summary: a huge number of sites have misconfigured firewalls that filter out “ICMP6 packet too big” packets. This breaks Path MTU discovery, causing hanging or broken IPv6 sessions.

Many sites unknowingly assume that the Internet MTU is at least 1500 bytes. This is wrong, whether in IPv4 or IPv6.

Many Internet hosts are connected through tunnels that reduce the real MTU. Use of PPPoE, for example on ADSL links, reduces the MTU by 8 bytes, and use of 6rd (“IPv6 rapid deployment” tunneling) reduces it by at least 20 bytes more (the encapsulating IPv4 header). As 6rd is used extensively in France (by the ISP Free), this is a big problem.
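You can check the actual path MTU yourself, for example with tracepath from Linux’s iputils (host name hypothetical); on a 6rd-tunneled line it typically reports 1480 bytes, i.e. 1500 minus the 20-byte IPv4 header encapsulating the 6rd traffic:

# probe the IPv6 path MTU toward a host (tracepath6 on older systems)
tracepath -6 server.example.org
# look for an output line such as:  1?: [LOCALHOST]  pmtu 1480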

1. The symptom: hanging IPv6 connections

Here’s a sample capture of a request where the server has more than one data packet to send.

08:39:57.785196 IP6 2a01:e35:8b50:2c40::7.39738 > 2001:xxx.43: S 165844086:165844086(0) win 65535 <mss 1440,nop,wscale 3,sackOK,timestamp 901

08:39:57.807709 IP6 2001:xxx.43 > 2a01:e35:8b50:2c40::7.39738: S 883894656:883894656(0) ack 165844087 win 14280 <mss 1440,sackOK,timestamp 2377433946 90108,nop,wscale 7>

08:39:57.808452 IP6 2a01:e35:8b50:2c40::7.39738 > 2001:xxx.43: .ack 1 win 8211 <nop,nop,timestamp 90132 2377433946>

08:39:57.808655 IP6 2a01:e35:8b50:2c40::7.39738 > 2001:xxx.43: P 1:9(8) ack 1 win 8211 <nop,nop,timestamp 90132 2377433946>

08:39:57.833052 IP6 2001:xxx.43 > 2a01:e35:8b50:2c40::7.39738: .ack 9 win 112 <nop,nop,timestamp 2377433972 90132>

08:39:57.888981 IP6 2001:xxx.43 > 2a01:e35:8b50:2c40::7.39738: P 1:1025(1024) ack 9 win 112 <nop,nop,timestamp 2377434026 90132>

(missing packet here : 1025:2453 containing 1428 bytes)

08:39:57.889315 IP6 2001:xxx.43 > 2a01:e35:8b50:2c40::7.39738: FP 2453:2723(270) ack 9 win 112 <nop,nop,timestamp 2377434027 90132>

08:39:57.890100 IP6 2a01:e35:8b50:2c40::7.39738 > 2001:xxx.43: .ack 1025 win 8211 <nop,nop,timestamp 90213 2377434026,nop,nop,sack 1 {2453:2723}>

(session hangs here, unterminated because of the missing bytes)

This is difficult to debug, as modern Unices have a “TCP host cache” keeping track of Path MTUs on a host-by-host basis, causing the problem to come and go in unpredictable ways depending on the size of the transmitted data.
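On FreeBSD, you can at least inspect and flush this cache while debugging (a sketch; sysctl names as found on FreeBSD):

# dump the TCP host cache: cached per-destination MTU, RTT, bandwidth, etc.
sysctl net.inet.tcp.hostcache.list
# schedule a purge so that subsequent connections redo path MTU discovery
sysctl net.inet.tcp.hostcache.purge=1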

2. A sample successful session with working trial-and-error Path MTU discovery

10:09:55.291649 IP6 2a01:e35:8b50:2c40::7.40948 > 2a01:e0d:1:3:58bf:fa61:0:1.43: S 1032533547:1032533547(0) win 65535 <mss 1440,nop,wscale 3,sackOK,timestamp 5487603 0>

10:09:55.291787 IP6 2a01:e0d:1:3:58bf:fa61:0:1.43 > 2a01:e35:8b50:2c40::7.40948:S 3695299654:3695299654(0) ack 1032533548 win 65535 <mss 1440,nop,wscale 3,sackOK,timestamp 3185067848 5487603>

10:09:55.316234 IP6 2a01:e35:8b50:2c40::7.40948 > 2a01:e0d:1:3:58bf:fa61:0:1.43: . ack 1 win 8211 <nop,nop,timestamp 5487628 3185067848>

10:09:55.317965 IP6 2a01:e35:8b50:2c40::7.40948 > 2a01:e0d:1:3:58bf:fa61:0:1.43: P 1:9(8) ack 1 win 8211 <nop,nop,timestamp 5487628 3185067848>

10:09:55.417301 IP6 2a01:e0d:1:3:58bf:fa61:0:1.43 > 2a01:e35:8b50:2c40::7.40948: . ack 9 win 8210 <nop,nop,timestamp 3185067974 5487628>

Now the big packet that was missing in the broken session above:

10:09:56.084457 IP6 2a01:e0d:1:3:58bf:fa61:0:1.43 > 2a01:e35:8b50:2c40::7.40948: . 1:1429(1428) ack 9 win 8210 <nop,nop,timestamp 3185068641 5487628>

The 6rd gateway replies with an ICMP6 message:

10:09:56.085221 IP6 2a01:e00:1:11::2 > 2a01:e0d:1:3:58bf:fa61:0:1: ICMP6, packet too big, mtu 1480, length 584

Missing data is retransmitted by the server using a lower packet size (and an entry is created in the server’s host cache to remember that):

10:09:56.085489 IP6 2a01:e0d:1:3:58bf:fa61:0:1.43 > 2a01:e35:8b50:2c40::7.40948: . 1:1409(1408) ack 9 win 8210 <nop,nop,timestamp 3185068642 5487628>

10:09:56.085522 IP6 2a01:e0d:1:3:58bf:fa61:0:1.43 > 2a01:e35:8b50:2c40::7.40948: . 1409:1429(20) ack 9 win 8210 <nop,nop,timestamp 3185068642 5487628>

Then the connection goes on to correct completion (no use showing the packets here).

Interestingly, trying an identical request afterwards shows that the MSS negotiation takes the host cache into account, with an MSS set to 1420 instead of 1440 from the start in the server’s reply:

10:10:14.053218 IP6 2a01:e35:8b50:2c40::7.20482 > 2a01:e0d:1:3:58bf:fa61:0:1.43: S 2231600544:2231600544(0) win 65535 <mss 1440,nop,wscale 3,sackOK,timestamp 5506365 0>

10:10:14.053382 IP6 2a01:e0d:1:3:58bf:fa61:0:1.43 > 2a01:e35:8b50:2c40::7.20482: S 2676514636:2676514636(0) ack 2231600545 win 65535 <mss 1420,nop,wscale 3,sackOK,timestamp 1128201317 5506365>

3. The simple fix

The fix is dead simple: just make sure that your filters are configured so that ICMP6 “packet too big” messages (type number 2) are correctly transmitted end-to-end, and correctly handled.
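For example, a minimal sketch of such pass rules, with ip6tables on Linux and pf on BSD (to be adapted to your existing rule set):

# Linux netfilter: accept ICMPv6 "packet too big" (type 2) messages
ip6tables -A FORWARD -p icmpv6 --icmpv6-type packet-too-big -j ACCEPT
# pf.conf (FreeBSD/OpenBSD): pass ICMPv6 "too big" messages
pass inet6 proto icmp6 icmp6-type toobig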

 


What to do on June 6th / IPv6 Launch day?

June 6th, 2012 is the “World IPv6 Launch” day: see http://www.worldipv6launch.org/

As it stands, it is presented as mainly oriented toward ISPs and hardware makers, giving the impression that home users are not concerned.

Actually, IPv6 deployment began years ago, but it has so far failed to appear on the radar of most organizations, which has slowed its adoption.

So let’s get things straight: you can participate from your home:

  • if your home ISP doesn’t provide you with IPv6 connectivity yet, they will have to in the not-too-distant future. Call them and ask when!
  • if your home ISP does already provide you with IPv6, activate it on your Internet connection and on your computer! In France, Free and Nerim have already been providing IPv6 connectivity for years.
  • if you run a personal server, activate IPv6 on it if available, and if not, ask for support!

It may be a little too soon to pester mobile phone operators (3G and 4G) to get IPv6 connectivity from them. They are telcos, after all… but if you feel like it, don’t hesitate to ask them, too, what their IPv6 deployment schedule is.

For French users, the G6 association has a nice set of resources on IPv6: http://g6.asso.fr/


Lossless import of MPEG-TS or HDV video files to iMovie

Here’s a little trick I learned and wanted to share. As it’s not complete, comments and additional hints are welcome!

The problem

I have a Canon HDV camcorder with many hours of HDV video. HDV is mpeg2-compressed video with a bitrate of about 25 Mbps.

I also have a MacOS X computer where I can run iMovie, Apple’s consumer-grade video editing application.

The camcorder video can be easily imported to FreeBSD using the built-in fwcontrol tool. It generates MPEG-TS files (mostly like IP TV channels) which read nicely in vlc, mplayer and other video tools. It’s easy and reliable.

The video can also be imported directly from the camcorder into iMovie, but this is painful and not well suited to easy archiving of the rushes. The import process is slow and buggy, and you often have to rewind the tape and restart it.

I wanted to get the best of both worlds — fwcontrol’s easy import followed with iMovie editing.

But iMovie doesn’t know how to directly import MPEG-TS files. It can only import video from .mov (QuickTime) or .mp4 (MPEG4) containers. It’s difficult to know which video codecs are supported by iMovie, but it seems to accept MPEG2, which means it can losslessly import HDV files: it’s just a matter of converting their container format from MPEG-TS to QuickTime. This saves us from the slow, error-prone, lossy and painful process of transcoding.

So how do you do that?

The (mostly complete) solution

Here’s the incantation that mostly works for me. input.mpg is my MPEG-TS file; it can come from a fwcontrol import or from an IPTV capture (a Freebox file, for example); output.mov is the resulting QuickTime-container file:

ffmpeg -i input.mpg -acodec copy -vcodec copy output.mov

On my server (a dual-core Intel Atom D525 with SATA disks, i.e. not a very fast machine) it converts at about 80-100 frames per second (3x to 4x real time), which is very fair (probably I/O-bound) and 12 to 20 times faster than transcoding the video.

From an IPTV capture you may have to explicitly transcode audio to AAC using -acodec libvo_aacenc instead.

Your second-best bet, if the above doesn’t work, is to let ffmpeg do a (much slower) almost-lossless transcoding to MPEG4, using the -sameq option, yielding a bigger file (almost twice as big as the original in my trials):

ffmpeg -i input.mpg -acodec copy -sameq output.mov

It works, but…

Why do I say it mostly works? Because there are two remaining gotchas:

  1. the original video timestamps (date and time of the video) are lost and set to the date and time of the conversion process; the timestamp is constant and doesn’t even increment throughout the file duration. It is probably an ffmpeg bug. I tweaked the import with the -copyts option, but this apparently handles the time index from the camcorder (duration from the beginning of the tape). This may (or may not) be related to the following error message from ffmpeg: [NULL @ 0x806c1d920] start time is not set in av_estimate_timings_from_pts
  2. iMovie doesn’t seem to grok huge files. It works for a couple hundred megabytes, but not for a couple gigabytes. So you may have to split files take by take, and I don’t know how to do that easily, especially given the broken timestamps described above (a possible starting point is sketched below).
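For the splitting part, a possible starting point (untested against the timestamp problem above; times and file names hypothetical) is ffmpeg’s -ss/-t options combined with the same lossless stream copy:

# losslessly extract one take: start 12 minutes in, keep 5 minutes
# (with stream copy, cut points land on the nearest keyframes)
ffmpeg -ss 00:12:00 -t 00:05:00 -i input.mpg -acodec copy -vcodec copy take2.mov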

Thanks to Benjamin Sonntag for the excellent idea of using ffmpeg for this 😉

Comments and especially clues/solutions more than welcome 😉


TCP-Estimated round-trip test

In an attempt to evaluate different methods for measuring the performance of a TCP/IP connection, I’ve bumped into FreeBSD‘s getsockopt(TCP_INFO) system call, cloned from a similar call invented by Linux, which kindly returns interesting data about the current TCP connection.

I was mainly interested in the round-trip time (RTT, called tcpi_rtt) and its standard deviation, mistakenly called tcpi_rttvar even though it’s not a variance.
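As an aside, on Linux (which pioneered this API) the same kernel estimates can be inspected without writing any code, using the ss utility; a quick sketch:

# show per-connection TCP internals; rtt:<srtt>/<rttvar> is in milliseconds
ss -ti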

I’ve written a small proof-of-concept tool accessible at http://eu.org:4500/ to display operating system information retrieved from the current HTTP access. The page currently runs on a FreeBSD 9-CURRENT machine; feel free to try it out, it works either in IPv4 or IPv6. Here’s a sample as of today:

This experimental page displays raw system TCP estimates, in microseconds.

Address: 2a01:e35:8b50:2c40::4
Estimated round-trip time: 15437
Estimated standard deviation: 27937

Note that the measurements are very rough. First, the real resolution is about 1 millisecond (one kernel tick), not 1 microsecond. Then, several RTT samples are smoothed into the provided values, with a bigger weight for more recent samples. I kept the actual values obtained from the kernel for clarity, even though displaying them with a 1 microsecond resolution is somewhat misleading.

Then, of course, the results also depend on the number of samples, which tends to be low: the above page waits for the client HTTP headers to be fully received, then emits its own headers in reply, then waits one second to give the TCP ack(s) time to come back, and then displays the then-current estimates.

The results are probably sufficient for TCP’s internal needs, but they may differ wildly from real RTT values. Plus, the real RTT depends on packet size, which TCP doesn’t seem to take into account. The above example was taken on my local network and displays over 15 ms for the RTT, whereas the real RTT is well below 1 ms (0.23 ms minimum, 0.4 ms average, 0.01 ms standard deviation, according to ping). The estimates are not always too high, either: I’ve noticed the opposite effect from a remote mobile phone, displaying ~100 ms whereas the ping time was more like ~200 ms…

Feel free to use it and add comments below.


Buzz: EPIC files a complaint

EPIC Urges Federal Trade Commission to Investigate Google Buzz.

EPIC fail“, as Arstechnica nicely puts it.

PDF of the detailed complaint, a must-read IMHO, which is the best summary to date of what the fuss is all about, including point by point discussion of the opt-out issues.

[followup article to Google Buzz privacy debacle, Let the second Buzz effect begin and Google Buzz start-up requesters update]


If you type Google into Google…

If you type “google buzz”, “google earth” or “google maps” (and probably some others) into Google search, you get a live, scrolling search entry in the middle of your results page, including Twitter feeds.

Why not, but it feels a bit weird, and you have to be fast if you want to catch one of the scrolling links. That’s not for grandma.


Google Buzz start-up requesters update

Apparently the Google Buzz team is very hard at work (on a week-end break!) to fix some of the more blatant problems with Buzz initialization, see their post titled A new Buzz start-up experience based on your feedback; including a nice apology at the end regarding the panic.

My gut feeling is that there are still way too many opt-out settings. List visibility, for one: opt-out still does not feel quite right to me; it is too easy for people not to care at first and regret it only when it’s too late. The obvious fact that leaked information can’t be unleaked should be taken into account.

Also, one of the core problems, clearly separating contacts vs friends, apparently remains. I’ll need to retry Buzz someday to see how it feels after this update.

Update: see also Google Buzz – anatomy of a slow motion train wreck, a very good analysis of what happened and things to expect. I share the feeling about the shift in privacy habits that Google (or should I say, some people at Google, but from where I stand that is irrelevant) are trying to shove down our throats.

Update: the mainstream press is getting angry, too. Furious John Naughton article in The Guardian, quoting:

In the real world, the devil is in the details. In cyberspace, it’s in the defaults. And the default settings in Buzz are so crass that one cannot imagine they are the product of corporate carelessness.

The Google boys are smart and know exactly what they’re doing. They’ve been enviously watching the stupendous growth of Twitter and Facebook and wondering how Google can cut them off at the knees before they become really unstoppable – which brings us back to Microsoft.

Update: another article, Buzz: Google Needs Better ‘People Skills’: (this one I find slightly unfair, albeit not totally undeserved)

Given the option, Google’s choice for default settings were what benefited Google the most, not what best protected its consumers.[…]

Privacy, however, impacts everything Google does. That the company could get Buzz privacy so terribly wrong is reason for serious concern.

Google needs to learn when to put people first and technology second.



Let the second Buzz effect begin

Ok, so after Google auto-subscribed about 10 million people to Buzz a few days ago to get traction, sit down and watch real-time mass unsubscription of upset people, including some vocal ones.
