regex - Outputting special characters correctly (unicode) in PERL -


I'm trying to find all the file names in a directory and determine which names contain special characters. I use the regex

  /[^-a-zA-Z0-9_.]/

The sample files (which I created using touch) are:

  pdf-2014a 014 and 7_06_64-os_o and l, _Inc.Pdf
  pdf-20_06_04-O_OnLine, _Inc.PDF
  pdf-20_0_0-Utà _d.wr.pdf
  pdf-20_12_28 -20.Mga_Grf
  Fwd_Notice_KDJFI789 & _JFK38.pdf
  pdf-2_0_0-C_â ???? _DUKE.pdf
  pdf-2_1_3-f_s-M_F_D and A.pdf
  pdf _-_ 401014 and 1007_0617_06264 -O_O and L, _Inc. PDF

I couldn't get the correct name printed after matching the name against the pattern in the regex. Perl does match the special character, but the character changes when it is output:
  * pdf-2_0_0-C_â ???? _DUKE.pdf > pdf-2_0_0-C_ _ _ _ DUKE.pdf

I tried uncommenting this line

  #binmode(STDOUT, ":utf8");

and then running the script. The placeholder characters do go away, but the output is still different:

  * pdf-2_0_0-C_â ???? _DUKE.pdf > pdf-2_0_0-C_Ã¢_DUKE.pdf

Here's my code:

  use strict;
  use warnings;
  use File::Find;
  use Cwd;
  use Term::ANSIColor;
  #binmode(STDOUT, ":utf8");

  my $start_directory = cwd();

  checkForSpecialChar(cwd());

  sub checkForSpecialChar {
      my ($source_dir) = @_;
      chdir $source_dir or die qq(Cannot change to "$source_dir");
      find(sub {
          # Highlight every character outside the allowed set in bold red
          while (m/([^-a-zA-Z0-9_.])/g) {
              chomp($_);
              print "$`";
              print color 'bold red';
              print "$1";
              print color 'reset';
              print "$'" . "\n";
          }
      }, ".");
      chdir("$start_directory");
  }

Any ideas people?

UPDATE: Hmm, you guys are right, there seems to be no problem with the regex.

Hi Ekholland, I tried changing the code to look like yours, but it still produces the same problem with the hyphen and the small letter a with grave. Without binmode(STDOUT, ":utf8") I get `` instead of the small a with grave, and with binmode(STDOUT, ":utf8") I get AA ??.

  use strict;
  use warnings;
  use File::Find;
  use Cwd;
  use Encode;
  use Term::ANSIColor;
  binmode(STDOUT, ":utf8");

  my $start_directory = cwd();

  checkForSpecialChar(cwd());

  sub checkForSpecialChar {
      my ($source_dir) = @_;
      chdir $source_dir or die qq(Cannot change to "$source_dir");
      find(sub {
          return unless -f;    # We want to print files only
          print $_ . "\n";
          $_ = Encode::decode_utf8($_);
          # Print the name one character at a time
          for (my $counter = 0; $counter < length($_); $counter++) {
              print Encode::encode_utf8(substr($_, $counter, 1)) . " ";
          }
      }, ".");
      chdir("$start_directory");
  }

Output without binmode(STDOUT, ":utf8"):

  pdf-2_0_0-c_a ???? _duke.pdf
  pdf - 2 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
  2014 A014 and 1007_0617_06264-O_O and L, _Inc.pdf
  pdf _ - _ 2 0 1 4a Ì ?? 0 1 4 and 1 0 7 _ 0 6 1 7 _ 0 6 2 6 4 - O_O & L, _InPF

You have to decode the file names on the way in and encode them on the way out. Something like this:

  use Encode;
  use File::Find;

  find(sub {
      $_ = Encode::decode_utf8($_);            # bytes -> characters
      while (m/([^-a-zA-Z0-9_.])/g) {
          my $chr = Encode::encode_utf8($1);   # characters -> bytes
          print "$chr\n";
      }
  }, ".");
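To see why the decode step matters without touching the file system, here is a rough, self-contained sketch. It assumes the troublesome character is U+2013 EN DASH stored as UTF-8 bytes (a guess, since the real name is garbled in the question), and `special_chars` is a hypothetical helper, not something from the original code.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: takes a file name as raw UTF-8 bytes and returns each
# character outside the allowed set, re-encoded as UTF-8 bytes for printing.
sub special_chars {
    my ($raw_name) = @_;
    my $name = decode('UTF-8', $raw_name);    # bytes -> characters
    my @found;
    while ($name =~ /([^-a-zA-Z0-9_.])/g) {
        push @found, encode('UTF-8', $1);     # characters -> bytes
    }
    return @found;
}

# Assumed sample name: "pdf-2_0_0-C_", U+2013 EN DASH, "_DUKE.pdf" as UTF-8 bytes.
my $raw = "pdf-2_0_0-C_\xE2\x80\x93_DUKE.pdf";

# Without decoding, the regex sees three separate bytes, not one character.
my @bytes = $raw =~ /([^-a-zA-Z0-9_.])/g;
my @chars = special_chars($raw);

# Encoding the undecoded bytes treats each byte as a Latin-1 character and
# encodes it again, which is what a :utf8 layer on STDOUT does to raw names.
my $twice = encode('UTF-8', $raw);

printf "byte matches: %d, character matches: %d\n", scalar @bytes, scalar @chars;
printf "raw length: %d, double-encoded length: %d\n", length $raw, length $twice;
print "$_\n" for @chars;
```

The double-encoded string contains the byte pair C3 A2, which is consistent with the Ã¢ reported in the question. If you would rather keep binmode(STDOUT, ":utf8"), the usual alternative is to decode the names and then print the decoded characters directly, skipping the explicit encode call, so the text is encoded exactly once.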
