File:  [parser3project] / parser3 / src / doc / string.dox
Revision 1.7: download - view: text, annotated - select for diffs - revision graph
Wed Sep 12 14:30:59 2012 UTC (13 years, 8 months ago) by moko
Branches: MAIN
CVS tags: release_3_5_1, release_3_5_0, release_3_4_6, release_3_4_5, release_3_4_4, release_3_4_3, HEAD
doxygen.cfg and footer.htm updated for doxygen 1.7.3
targets.dox and string.dox slightly actualized

/**	@page String String

In memory strings [String] are stored as letters [actually, cord objects from libgd cord library, see cord.h file] 
plus list of fragments [String::Languages] which contain language of fragment+it's length.

Fragments that are received from stdout of scripts are considered clean(String::Language ::L_CLEAN),
but those from visitor - from stderr of scripts, from environment, from forms,
from dist [table:load] or from sql server [table:sql] are considered tainted(String::Language::L_TAINTED),
when string involved in different operations, it can be split, but all resulting parts 
still remember languages of their fragments.
String can be written to Request::wcontext, then part of fragments can change their language. 
Language is set to those fragments, which were not in particular language, but just tainted[L_TAINTED], 
they become "dirty, but we need what to do to make them clean, that is we know their language".
Say,
@verbatim
^void:sql{insert into news (title) values ('$form:title')]
@endverbatim
when parameter of sql is processed, with help of Temp_lang 
we set "current language" [Request::flang], and when string got written [Request::write_assign_lang] 
L_TAINTED string from $form:title part of parameter of sql-method, the one in apostrophes, got assigned an L_SQL language.

String can be converted to normal string using String::cstr().
If it is called with this parameter String::cstr(String::L_UNSPECIFIED) then
when it will be converted, fragment languages would be taken into consideration, and 
corresponding cleansing performed.
Particular language can also be specified String::cstr(String::Language) [by default =L_AS_IS], 
then all string fragments would forcibly considered to be in that language, 
regardles of the language inside fragment.

This is used, for instance, in work with file names
[ATTENTION: never ever use this construction if you care about your secret files,
it is used only as an example]:
@verbatim
$file[^table::load[$form:file_name]]
@endverbatim
here with normal $form:file_name processing would be L_HTML|L_OPTIMIZE_BIT, while we need L_FILE_SPEC,
and it would be stupid everywhere to do like we do in table:sql, insisting on {} parameters.

Usual language of output is String::Language::L_HTML|String::Language::L_OPTIMIZE_BIT, 
exception is CGI script, which got started outside of web-server,
in that case language String::Language::L_AS_IS is used.

In fragments where language is marked as OPTIMIZED (String::Language::L_OPTIMIZE_BIT), 
when they are converted into string String::cstr white spaces would be optimized:
out of several consequent characters would be left only first, others would be wiped off the result.

When code works with char* it is assumed it is can never be 0.
k
*/

E-mail: