key indexing weirdness


key indexing weirdness

Jonathon Weiss

Hi All,

It was pointed out to me that searching for the full name associated
with key 0xF5832DD7C85CC5AD did not find the key, though searching for
just the first name did.  Actually searching for just the last name
seems to fail too.  The random 2 other SKS servers I checked seem to
have the same problem too.  Any idea what would cause this?

        Jonathon

        Jonathon Weiss <[hidden email]>
        MIT/IS&T/OIS  Server Operations


_______________________________________________
Sks-devel mailing list
[hidden email]
http://lists.nongnu.org/mailman/listinfo/sks-devel

Re: key indexing weirdness

Dinko Korunic
On Wed, Aug 05, 2009 at 05:32:39PM -0400, Jonathon Weiss wrote:
> It was pointed out to me that searching for the full name associated
> with key 0xF5832DD7C85CC5AD did not find the key, though searching for
> just the first name did.  Actually searching for just the last name
> seems to fail too.  The random 2 other SKS servers I checked seem to
> have the same problem too.  Any idea what would cause this?

Just tried on pks.aaiedu.hr which is running latest SKS from Minsky's hg
repo (1.1.0) and search finds 0xF5832DD7C85CC5AD normally.

--
NAME:Dinko.kreator.Korunic   DISCLAIMER:Standard.disclaimer.applies
ICQ:16965294        JAB:[hidden email]        PGP:0xEA160D0B
HOME:http://dkorunic.net    QUOTE:Eat.right.stay.fit.and.die.anyway



Re: key indexing weirdness

Peter Pramberger
In reply to this post by Jonathon Weiss
Jonathon Weiss schrieb am 05.08.2009 23:32:
> It was pointed out to me that searching for the full name associated
> with key 0xF5832DD7C85CC5AD did not find the key, though searching for
> just the first name did.  Actually searching for just the last name
> seems to fail too.  The random 2 other SKS servers I checked seem to
> have the same problem too.  Any idea what would cause this?

Interesting, I can reproduce the problem.

While looking at the key, I noticed that it has only one uid without any email
address on it.

There is another key with the same characteristics: 0x1afcb4e4bc443d55. If you
search for its full name, you get only one result; but not the given one.
Leave the last word out, and you get both. Coincidence?


Br,
Peter




Re: key indexing weirdness

Peter Pramberger
Peter Pramberger schrieb am 06.08.2009 01:49:
> There is another key with the same characteristics: 0x1afcb4e4bc443d55. If you
> search for its full name, you get only one result; but not the given one.

Another sample: 0x658a12505103e44f.


Br,
Peter





[PATCH] Proper case handling for words index (was: key indexing weirdness)

Kim Minh Kaplan-4
In reply to this post by Jonathon Weiss
Jonathon Weiss:

> Hi All,
>
> It was pointed out to me that searching for the full name associated
> with key 0xF5832DD7C85CC5AD did not find the key, though searching for
> just the first name did.  Actually searching for just the last name
> seems to fail too.  The random 2 other SKS servers I checked seem to
> have the same problem too.  Any idea what would cause this?

It is a bug in SKS: during indexing it does not properly handle
uppercase letters in the last word of the user ID.  Here is the fix that is
currently running on my keyserver.

Kim Minh.


BUGFIX: The last word of a user id was not properly case converted.

Properly downcase every word of the user id for word indexing purpose.

Reported in http://lists.gnu.org/archive/html/sks-devel/2009-08/msg00003.html

diff -r 46c6aaac31da utils.ml
--- a/utils.ml Sat Mar 28 22:04:00 2009 -0400
+++ b/utils.ml Thu Aug 06 17:49:37 2009 +0200
@@ -100,16 +100,17 @@
 
 
 let rec extract_words_rec s ~start ~len partial =
+  let one () = Set.add (String.lowercase (String.sub s start len)) partial in
   if start + len = String.length s
   then ( if len = 0 then partial
-         else Set.add (String.sub s start len) partial )
+         else one ())
   else (
     if is_alnum s.[start + len]
     then extract_words_rec s ~start ~len:(len + 1) partial
     else ( if len = 0
            then extract_words_rec s ~start:(start + 1) ~len partial
            else extract_words_rec s ~start:(start + len)  ~len:0
-                (Set.add (String.lowercase (String.sub s start len)) partial)
+                (one ())
          )
   )
 

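To see why only the last word is affected, here is a minimal standalone sketch of the same recursion. It is a model and not the SKS source: `Words`, `extract_words`, `matches`, the `lower_last` switch, the ASCII-only `is_alnum` and the user ID "Alice Example" are all assumptions made for illustration, and `String.lowercase_ascii` stands in for the older `String.lowercase` used in the patch. The sketch also assumes that search queries are lowercased before lookup, which is what the reported symptoms suggest.

```ocaml
(* Standalone model of the word extraction; not the SKS source itself. *)
module Words = Set.Make (String)

(* Assumption: ASCII letters and digits only. *)
let is_alnum c =
  (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9')

(* [lower_last] selects the behaviour: false reproduces the bug (the final
   word is stored with its original case), true models the patched code. *)
let extract_words ~lower_last s =
  let n = String.length s in
  let word start len = String.lowercase_ascii (String.sub s start len) in
  let rec go start len acc =
    if start + len = n then
      if len = 0 then acc
      else if lower_last then Words.add (word start len) acc
      else Words.add (String.sub s start len) acc
    else if is_alnum s.[start + len] then go start (len + 1) acc
    else if len = 0 then go (start + 1) len acc
    else go (start + len) 0 (Words.add (word start len) acc)
  in
  go 0 0 Words.empty

(* A query matches only if every one of its lowercased words is indexed. *)
let matches index query =
  Words.subset (extract_words ~lower_last:true query) index

let () =
  let uid = "Alice Example" in (* hypothetical user ID without an email *)
  let buggy = extract_words ~lower_last:false uid in
  let fixed = extract_words ~lower_last:true uid in
  Printf.printf "buggy index: %s\n" (String.concat "," (Words.elements buggy));
  Printf.printf "fixed index: %s\n" (String.concat "," (Words.elements fixed));
  Printf.printf "first name, buggy index: %b\n" (matches buggy "alice");
  Printf.printf "full name,  buggy index: %b\n" (matches buggy "alice example");
  Printf.printf "full name,  fixed index: %b\n" (matches fixed "alice example")
```

In this model the unconverted branch is reached only when the string ends in the middle of a word. A user ID ending in "<user@example.com>" ends with ">", so its last word is handled by the branch that already lowercases. That would be consistent with the observation that the affected keys carry no email address.
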

Re: [PATCH] Proper case handling for words index (was: key indexing weirdness)

Phil Pennock-17
On 2009-08-07 at 05:54 +0000, Kim Minh Kaplan wrote:
> It is a bug in SKS: during indexing it does not properly handle
> uppercase letters in the last word of the user ID.  Here is the fix that is
> currently running on my keyserver.

Hi Kim,

It's been a while since I looked at the code-base; does this require a
dump and reindex or will the index changes be picked up anyway?

Thanks,
-Phil



Re: [PATCH] Proper case handling for words index

Kim Minh Kaplan
Phil Pennock writes:

> Hi Kim,
>
> It's been a while since I looked at the code-base; does this require a
> dump and reindex or will the index changes be picked up anyway?
>
> Thanks,
> -Phil

Ah yes I forgot to mention that you need to rebuild the keys and ptree
databases from scratch.  Is a command for manually (automatically?)
reindexing the database without rebuilding necessary?

Kim Minh.



Re: [PATCH] Proper case handling for words index

Phil Pennock-17
On 2009-08-07 at 07:20 +0000, Kim Minh Kaplan wrote:

> Phil Pennock writes:
>
> > Hi Kim,
> >
> > It's been a while since I looked at the code-base; does this require a
> > dump and reindex or will the index changes be picked up anyway?
> >
> > Thanks,
> > -Phil
>
> Ah yes I forgot to mention that you need to rebuild the keys and ptree
> databases from scratch.  Is a command for manually (automatically?)
> reindexing the database without rebuilding necessary?

"Necessary", no.  Nice to have, yes.  But I'll just shut down, take a
copy, start back up, run dump/init on the new system and then switch
over later, letting changes re-fill back in.

(But not tonight or this weekend).

How much work would be involved in having live re-indexing?

-Phil



Re: Re: [PATCH] Proper case handling for words index

Christoph Anton Mitterer-2
Hi.

Are we going to see a new SKS release in the near future, with all the
recent patches (IPv6, DNS, this one, etc.)?
Perhaps including an end-user guide on how to recover from bugs
like this one (the dump/restore procedure, etc.)?


Best wishes,
Chris.


Re: [PATCH] Proper case handling for words index

Kim Minh Kaplan
In reply to this post by Phil Pennock-17
Phil Pennock writes:

> Kim Minh Kaplan wrote:
>
>> Is a command for manually (automatically?)
>> reindexing the database without rebuilding necessary?
>
> "Necessary", no.  Nice to have, yes.  But I'll just shut down, take a
> copy, start back up, run dump/init on the new system and then switch
> over later, letting changes re-fill back in.
>
> (But not tonight or this weekend).
>
> How much work would be involved in having live re-indexing?

I am working on a little program for reindexing; it looks simple enough.
It will still require a shutdown, a reindex, and a restart.

Kim Minh.



Re: Re: [PATCH] Proper case handling for words index

Kim Minh Kaplan
This should correct the word index.

    $ ./fix_word_index
    2009-08-08 13:35:41 Fixing word index
    2009-08-08 13:38:04 Fixed 17247 words

I even believe you can run it without stopping the SKS db server, but I
am not certain.  Feedback on this part would be much appreciated, as I am
still investigating SKS's safety with regard to transactions.

Kim Minh.


Add fix_word_index to repair the word database without rebuilding the whole database.

diff -r 912355056ea4 Makefile
--- a/Makefile Sat Aug 08 11:51:09 2009 +0200
+++ b/Makefile Sat Aug 08 13:25:01 2009 +0200
@@ -153,6 +153,9 @@
 sks.8: sks.pod
  pod2man -c "SKS OpenPGP Key server" --section 8 -r 0.1 -name sks sks.pod sks.8
 
+fix_word_index: $(LIBS) $(ALLOBJS) fix_word_index.cmx
+ $(OCAMLOPT) -o fix_word_index $(OCAMLOPTFLAGS) $(ALLOBJS) fix_word_index.cmx
+
 spider: $(LIBS) $(ALLOBJS) spider.cmx
  $(OCAMLOPT) -o spider $(OCAMLOPTFLAGS) $(ALLOBJS) spider.cmx
 
diff -r 912355056ea4 fix_word_index.ml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/fix_word_index.ml Sat Aug 08 13:25:01 2009 +0200
@@ -0,0 +1,118 @@
+open Common
+open Bdb
+open Keydb
+
+let env = ref None
+let db_word = ref None
+
+let settings = {
+  Keydb.withtxn = !Settings.transactions;
+  Keydb.cache_bytes = !Settings.cache_bytes;
+  Keydb.pagesize = !Settings.pagesize;
+  Keydb.dbdir = Lazy.force Settings.dbdir;
+  Keydb.dumpdir = Lazy.force Settings.dumpdir;
+}
+
+let get_env settings =
+  match !env with
+    Some e -> e
+  | None ->
+      let e = Dbenv.create () in
+      try
+        (
+          match settings.cache_bytes with
+            None -> ()
+          | Some cache_bytes ->
+              Dbenv.set_cachesize e ~gbytes:0 ~bytes:cache_bytes ~ncache:0);
+        Dbenv.dopen e settings.dbdir
+          ([Dbenv.INIT_MPOOL; Dbenv.CREATE ]
+           @ (if settings.withtxn then [ Dbenv.INIT_TXN ]
+              else [] ) )
+          0o600;
+        env := Some e;
+        e
+      with
+        ex ->
+          if !env = None then Dbenv.close e;
+          raise ex
+
+let get_db settings =
+  plerror 5 "Opening word index";
+  match !db_word with
+    Some db -> db
+  | None ->
+      let dbenv = get_env settings in
+      let openflags = (
+        if settings.withtxn then [Db.CREATE; Db.AUTO_COMMIT]
+        else [Db.CREATE])
+      in
+      let db = Db.sopen ~dbenv "word" Db.BTREE
+          ~moreflags:[Db.DUPSORT] openflags 0o600
+      in
+      db_word := Some db;
+      db
+
+let close_db () =
+  plerror 5 "Closing word index";
+  (match !db_word with
+    Some db -> Db.close db
+  | None -> ());
+  db_word := None;
+  (match !env with
+    Some e -> Dbenv.close e
+  | None -> ());
+  env := None
+
+let fix_words_loop ~txn db c =
+  let rec loop n =
+    match (try Some (Cursor.get c Cursor.NEXT []) with Not_found -> None)
+    with
+      None -> n
+    | Some (key,fp) ->
+        let canon_key = String.lowercase key in
+        if key = canon_key
+        then loop n
+        else (
+          plerror 5 "Fix 0x%s %s -> %s" (Utils.hexstring fp) key canon_key;
+          (try
+            match txn with
+              None -> Db.put db canon_key fp [Db.NODUPDATA];
+            | Some txn -> Db.put db ~txn  ~key:canon_key ~data:fp [Db.NODUPDATA]
+          with
+            Bdb.Key_exists -> ());
+          Cursor.del c;
+          loop (n+1))
+  in
+  loop 0
+  
+let fix_words ?txn db settings =
+  plerror 0 "Fixing word index";
+  let txn =
+    ref (if settings.withtxn
+    then Some(Txn.txn_begin (get_env settings) txn [])
+    else None)
+  in
+  let count = protect
+      ~f:(fun () ->
+        let c =
+          match !txn with
+            None -> Cursor.create db
+          | Some txn -> Cursor.create ~txn db
+        in
+        let count = protect ~f:(fun () -> fix_words_loop !txn db c)
+            ~finally:(fun () -> Cursor.close c);
+        in begin
+          match !txn with None -> () | Some t -> Txn.commit t [];
+        end;
+        txn := None;
+        count)
+      ~finally:(fun () ->
+        match !txn with
+          Some tx -> Txn.abort tx
+        | None -> ())
+  in
+  plerror 0 "Fixed %d words\n" count
+
+let () =
+  let f () = fix_words (get_db settings) settings in
+  protect ~f ~finally:close_db
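
The core of the repair pass can be sketched without Berkeley DB. The following is an in-memory model only, under stated assumptions: the word index is treated as a set of (word, value) pairs, and `Entries`, `fix_words` and the sample data are hypothetical names that do not appear in SKS. It mirrors the loop in the patch: entries whose word is already lowercase are skipped, every other entry is re-inserted under the lowercased word (silently tolerating an existing duplicate, as the `Db.NODUPDATA` / `Key_exists` handling does) and then deleted, and each such entry is counted.

```ocaml
(* In-memory model of the repair pass; the real tool walks a Berkeley DB
   cursor over the "word" database instead of a set. *)
module Entries = Set.Make (struct
  type t = string * string (* (word, value stored under that word) *)
  let compare = compare
end)

(* Returns the repaired index and the number of entries whose word had to
   be rewritten, like the "Fixed N words" log line. *)
let fix_words index =
  Entries.fold
    (fun (word, fp) (fixed, n) ->
      let canon = String.lowercase_ascii word in
      if canon = word then (fixed, n)
      else
        (* Add the lowercase entry (a no-op if it already exists) and
           drop the mixed-case one. *)
        (Entries.add (canon, fp) (Entries.remove (word, fp) fixed), n + 1))
    index (index, 0)

let () =
  (* Hypothetical sample data: "h1" and "h2" stand for two keys. *)
  let index =
    Entries.of_list
      [ ("alice", "h1"); ("Example", "h1"); ("Example", "h2"); ("example", "h2") ]
  in
  let fixed, n = fix_words index in
  Printf.printf "Fixed %d words, %d entries remain\n" n (Entries.cardinal fixed)
```

As in the patch, an entry counts as fixed even when the lowercase version already existed, so the count reports mixed-case entries removed rather than new entries created.
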


Re: Re: [PATCH] Proper case handling for words index

Jason Harris
On Sat, Aug 08, 2009 at 11:43:43AM +0000, Kim Minh Kaplan wrote:
> This should correct the word index.
>
>     $ ./fix_word_index
>     2009-08-08 13:35:41 Fixing word index
>     2009-08-08 13:38:04 Fixed 17247 words

I would suggest the output go to its own logfile, like update_subkeys:

  update_subkeys.ml:121:  set_logfile "update_subkeys";

and be verbose by default (assuming I got my extra output because I set
"debuglevel: 5" in sksconf):

  2009-08-10 11:50:40 0.090344 Opening word index
  2009-08-10 11:50:40 0.133568 Fixing word index
  2009-08-10 11:50:40 0.597982 Fix 0x65DA815FA6DE640C9B4CD96880F82E78 00A1130001 -> 00a1130001
  ...
  2009-08-10 11:50:46 0.842120 Fix 0x0351CEB2FE16722FDE690C481C85A795 A -> a
  ...
  2009-08-10 11:58:20 0.679230 Fix 0x0A79227FF0101731FF24EFDE1CEFE3AE alanLovejoy -> alanlovejoy
  ...
  2009-08-10 12:06:13 0.042849 Fix 0x5768AE9CF1614392A710D2FFB4EE538C ? -> ?
  ...
  2009-08-10 12:06:13 0.608868 Fix 0x9F5C194CF96FEC513559AE05A44A3213 ??? -> ???
  2009-08-10 12:06:13 0.755021 Fixed 17458 words
  2009-08-10 12:06:13 0.755180 Closing word index

> I even believe you can run it without stopping the SKS db server, but I
> am not certain.  Feedback on this part would be much appreciated, as I am
> still investigating SKS's safety with regard to transactions.

I removed Dbenv.RECOVER from my flags a long time ago and running
./fix_word_index in parallel with "sks db" seemed to work fine.

Deadlock can still easily occur, however, since SKS doesn't yet do
full deadlock setup/handling in the BDB API, IIRC.

BTW, I would like to pull these patches directly from your repo. for
better tracking.  Thanks.

--
Jason Harris           |  NIC:  JH329, PGP:  This _is_ PGP-signed, isn't it?
[hidden email] _|_ web:  http://keyserver.kjsl.com/~jharris/
          Got photons?   (TM), (C) 2004


Re: [PATCH] Proper case handling for words index (was: key indexing weirdness)

Dinko Korunic
In reply to this post by Kim Minh Kaplan
On Fri, Aug 07, 2009 at 05:54:47AM +0000, Kim Minh Kaplan wrote:
> uppercase letters in the last word of the user ID.  Here is the fix that is
> currently running on my keyserver.

Applied to pks.aaiedu.hr too. It seems to be working: searching for the
names associated with the keys mentioned on the list now successfully
finds them.

However, I haven't tried the fix-in-place utility; I rebuilt the keys and
ptree databases from scratch, since it was a good opportunity to switch to
BDB 4.7.


D.

--
NAME:Dinko.kreator.Korunic   DISCLAIMER:Standard.disclaimer.applies
ICQ:16965294        JAB:[hidden email]        PGP:0xEA160D0B
HOME:http://dkorunic.net    QUOTE:Eat.right.stay.fit.and.die.anyway



Re: Re: [PATCH] Proper case handling for words index

Kim Minh Kaplan
In reply to this post by Jason Harris
Jason Harris:

> I would suggest the output go to its own logfile, like update_subkeys:
>
>   update_subkeys.ml:121:  set_logfile "update_subkeys";
>
> and be verbose by default (assuming I got my extra output because I set
> "debuglevel: 5" in sksconf):

This fixup script does not really deserve to be maintained.  Once the
database is fixed it's of no more use.  I hope we will quickly forget
about it.

> I removed Dbenv.RECOVER from my flags a long time ago and running
> ./fix_word_index in parallel with "sks db" seemed to work fine.

Nice.  By the way does removing Dbenv.RECOVER permit you to dump your db
live, without stopping the sks server?

> Deadlock can still easily occur, however, since SKS doesn't yet do
> full deadlock setup/handling in the BDB API, IIRC.

Right.  In this case one would have to launch it again.  Mmm, I am not
sure a deadlock would be autodetected.  OK, you should stop sks db while
running this script.  Or run the Berkeley DB deadlock detection utility.

> BTW, I would like to pull these patches directly from your repo. for
> better tracking.  Thanks.

I am very hesitant to fork SKS.  But as Yaron Minsky has been very quiet
these last months it could be the only solution.  What do people think
about this here?

Kim Minh.



Re: Re: [PATCH] Proper case handling for words index

Jason Harris
On Wed, Aug 12, 2009 at 12:54:43PM +0000, Kim Minh Kaplan wrote:

> This fixup script does not really deserve to be maintained.  Once the
> database is fixed it's of no more use.  I hope we will quickly forget
> about it.

It is good OCaml sample code, at least.  I plan to keep it around in my
cloned repo.

> Nice.  By the way does removing Dbenv.RECOVER permit you to dump your db
> live, without stopping the sks server?

Definitely!

> I am very hesitant to fork SKS.  But as Yaron Minsky has been very quiet
> these last months it could be the only solution.  What do people think
> about this here?

Clearly, these patches are needed and should be made available in at
least one public repo...

We should use the full power of mercurial for distributed development.
Most (all?) of the contributed patches added to SKS since switching to
hg are not attributed properly/fully via hg pulls, which is a shame.
Adding them to personal repos should be done first, and hg patchsets
published/emailed/pulled to track authorship.  Publishing plain old
patches and making everyone import/attribute them manually into their
own hg repo. is dumb.

To facilitate this, and to track patchset usage on various servers,
we need to at least add the hg changeset to the SKS version string.
The last official SKS is then known as 1.1.0@hg:46c6aaac31da, even
if it was pulled from your repo. and not Yaron's.
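
A version string of that shape could be assembled along these lines. This is only a sketch of the proposed format: `version_string` and its parameters are hypothetical, and how the changeset id would actually reach the build (for example from `hg` at compile time) is left open.

```ocaml
(* Hypothetical helper; SKS's actual version handling may look different. *)
let version_string ~release ~changeset =
  match changeset with
  | None -> release (* no changeset known: plain release number *)
  | Some id -> Printf.sprintf "%s@hg:%s" release id

let () =
  print_endline
    (version_string ~release:"1.1.0" ~changeset:(Some "46c6aaac31da"))
```
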

--
Jason Harris           |  NIC:  JH329, PGP:  This _is_ PGP-signed, isn't it?
[hidden email] _|_ web:  http://keyserver.kjsl.com/~jharris/
          Got photons?   (TM), (C) 2004
