Cyclone 1.6b1

An Interface for Apple Text Encoding Converter.

Abracode, Inc.

“Butterflies stir a breeze
and the ripples flow unceasingly:
far away the cyclones swirl.
It's a whole, connected world.”


A short description for Read Me haters

Cyclone is a text conversion utility that uses the Apple Text Encoding Converter.
Highlights include:

Version history

Details for the curious

Cyclone is free for any use (except for abuse).
Any distribution is encouraged but Abracode retains copyright for the program and does not wish to see any modified/incomplete versions distributed.

Cyclone Requirements:

Theory of operation
Text Encoding Converter (called TEC) is a Mac OS engine for handling different languages using different character sets. It supports many standards and is robust and pretty fast. Many applications use it for their internal conversion needs, and that's great, but I could not find a plain converter using this engine. So here comes Cyclone.

Highlights revisited
Because Cyclone uses TEC's conversion maps, it will grow with TEC even if the program itself is no longer developed. When more encodings appear in future incarnations of TEC, or any maps are corrected or modified, Cyclone is supposed to use them as if nothing has changed. (OK, one exception: I hard-coded the names of encodings because I was not satisfied with the names returned by TEC, but if Cyclone does not find the name for a new encoding in its own resources, it will use the name given by TEC.) TEC does not change line endings properly, so I added this option (any bugs in this field are mine); see the “More details” section for specifications.
Cyclone can convert many files dragged onto it or chosen from the standard file dialog.

When you look at the conversion dialog you will see two sets of pop-ups: left for input, right for output. Choose the standard/platform first, then the specific encoding, and lastly the variant (if any variant exists for the given encoding). You may choose whatever you want for input and output encodings, but be aware that not all conversions make sense: you cannot translate from Chinese to Greek with TEC (not yet :-)). Sometimes you will get an error, but sometimes not. You are responsible for choosing valid encodings for input and output. You may use content sniffers, which can help with the input encoding (see the “More details” section for a description of sniffers), but do not rely on them.
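
To show why some pairs of encodings work and others do not, here is a minimal sketch in Python. It does not use TEC or Cyclone at all: the function name convert is made up for the example, and the codec names (mac_roman, cp1252, big5, iso8859_7) are Python's own, not TEC identifiers.

```python
def convert(data: bytes, source: str, target: str) -> bytes:
    """Decode bytes from the source encoding, then re-encode in the target."""
    text = data.decode(source)   # raises UnicodeDecodeError on invalid input
    return text.encode(target)   # raises UnicodeEncodeError if unrepresentable


# A sensible pair: Mac Roman to Windows Latin 1.
mac_bytes = "café".encode("mac_roman")
win_bytes = convert(mac_bytes, "mac_roman", "cp1252")

# A pair that makes no sense: Chinese text has no place in a Greek code page.
try:
    convert("中文".encode("big5"), "big5", "iso8859_7")
    chinese_to_greek_ok = True
except UnicodeEncodeError:
    chinese_to_greek_ok = False

# Choosing the wrong input encoding may raise no error at all,
# yet the resulting text is wrong.
wrong_text = mac_bytes.decode("cp1252")
```

The last line is the "sometimes not" case: the bytes decode without complaint, but the accented letter comes out as a different character.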

I implemented the following options to make my life easier (and hopefully yours too):

Multiple file settings

Two little features:

More details for the very curious

Content sniffers
Content sniffing is a feature offered by TEC and used by Cyclone when checked in preferences.
When this option is active, Cyclone tries to suggest which input encoding is used. Unfortunately, the current TEC version (1.5) can guess content ONLY for Far Eastern languages. So if you use these languages frequently, this option is for you. Otherwise you will be annoyed that Cyclone (or TEC, to be precise) suggests Chinese or Japanese every time you want to convert plain ASCII text.
This option is turned off by default.
Content sniffing is not working correctly.
I do not use it and people seem not to care about it — this is why it is not fixed yet.

Sniffers available in TEC 1.4.3 and 1.5 (in order of appearance):



Line Breaks
As mentioned before, TEC does not change the line breaks to match the output standard. For example, when you convert from Mac to Windows, everything is converted OK except for line endings, which remain in the Mac standard. So the option to change the line breaks has been added. Here are the rules for output standards when the "Match output standard" option is chosen:

In Cyclone version 1.5 and up you may also specify line breaks explicitly by choosing the "Ask" option in preferences. In this case a new pop-up is added to the conversion dialog, allowing you to set line breaks manually.
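
As a rough illustration of what changing line breaks involves, here is a small Python sketch. It is not Cyclone's code: the function name change_line_breaks and the platform keys are made up for the example, and the table assumes the conventional endings (CR for classic Mac OS, LF for Unix, CR LF for Windows) rather than quoting Cyclone's own rules.

```python
import re

# Conventional line endings; assumed here, not taken from Cyclone's rule table.
LINE_BREAKS = {"mac": "\r", "unix": "\n", "windows": "\r\n"}

# Match CR LF first so it is not treated as two separate breaks.
_ANY_BREAK = re.compile(r"\r\n|\r|\n")


def change_line_breaks(text: str, platform: str) -> str:
    """Replace every CR, LF, or CR LF with the target platform's line break."""
    newline = LINE_BREAKS[platform]
    return _ANY_BREAK.sub(lambda match: newline, text)


mac_text = "first\rsecond\rthird"
windows_text = change_line_breaks(mac_text, "windows")
```

Matching CR LF as a single unit matters: replacing CR and LF separately would turn each Windows line break into two.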

Unicode and HTML
HTML writers, please note that if you are building a page where most (or all) characters are ASCII, the encoding of choice for you is Unicode UTF-8. If all characters are ASCII, the length of your page will be exactly the same as if no Unicode were used.
To inform a browser that Unicode UTF-8 is used, type:
<META HTTP-EQUIV="content-type" CONTENT="text/html;charset=UTF-8">
between <HEAD> and </HEAD> at the beginning of your file.
You should not use the Unicode paragraph separator (PS, U+2029, which is 0xE280A9 in UTF-8) as a line break because most browsers do not support it.
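
The claim about page length is easy to check. The following Python sketch (unrelated to Cyclone's implementation; the sample page is invented) compares byte counts for ASCII-only text and shows the bytes of the paragraph separator:

```python
ascii_page = "<HTML><HEAD><TITLE>Hello</TITLE></HEAD></HTML>"

# Pure ASCII text occupies the same number of bytes in ASCII and in UTF-8.
same_length = len(ascii_page.encode("ascii")) == len(ascii_page.encode("utf-8"))

# A non-ASCII character takes more than one byte in UTF-8.
e_acute_bytes = "\u00e9".encode("utf-8")

# The paragraph separator (PS, U+2029) is the three-byte sequence E2 80 A9.
ps_bytes = "\u2029".encode("utf-8")
```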

More Unicode notes
The registered type for standard Unicode (UTF-16) text is 'utxt' (used for file and clipboard), while plain 8-bit text uses 'TEXT'. You may not be able to see the content of the clipboard or paste it if the application you use does not support Unicode. Unicode UTF-8 and UTF-7 remain 'TEXT'.
Each standard Unicode (UTF-16) text produced by Cyclone has a byte-order mark (0xFEFF) at the beginning to ensure 100% portability.
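
To see what the byte-order mark buys you, here is a short Python sketch. It only demonstrates the general UTF-16 mechanism using Python's standard codecs module; it says nothing about how Cyclone writes its files internally.

```python
import codecs

text = "Cyclone"

# Python's generic "utf-16" codec prepends a byte-order mark when encoding.
encoded = text.encode("utf-16")
has_bom = encoded[:2] in (codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)

# The same text stored in both byte orders, each with its mark (0xFEFF).
big_endian = codecs.BOM_UTF16_BE + text.encode("utf-16-be")
little_endian = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

# A reader uses the mark to pick the right byte order, then discards it.
decoded_be = big_endian.decode("utf-16")
decoded_le = little_endian.decode("utf-16")
```

Both byte orders decode to the same text, which is the portability the mark provides.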

Beginning with version 1.1, Cyclone is scriptable via AppleScript. Please see the sample scripts provided in the “Scripting” folder. A document entitled “Encodings Dictionary” contains predefined encoding names which can be used in scripts.
Available AppleScript commands:

Beginning with version 1.3 you may pass an Internet name for an encoding.
This option is available with any “convert” command: “convert”, “convert text”, “convert clipboard”:

For a complete list of available encodings and their Internet name equivalents, see the “Encodings Dictionary”.

Setting Options:

Cyclone 1.5 adds support for optionally setting line breaks in the exported file.
This option is available with any “convert” command: “convert”, “convert text”, “convert clipboard”:
The following sample demonstrates the syntax:

If not set explicitly, NoChangeLineBreaks is assumed, so it is safe to omit "with" and the parameter that follows it.
This way your old scripts will work without any change.

The available options are:

The future
Beginning with version 1.6 the sources are open and developers are welcome to submit code additions.

Small print
The author gives no warranty for this software and takes no responsibility for any damages that it may cause. If you cannot accept it, please delete your copy.
All trademarks are properties of their owners.

* the quotation is from Peter Hammill (“Gaia”).